NUCLEIC ACIDS AND PROTEINS FROM STREPTOCOCCUS PNEUMONIAE

The present invention relates to proteins derived from Streptococcus pneumoniae, 
nucleic acid molecules encoding such proteins, the use of the nucleic acid and/or 
proteins as antigens/immunogens and in detection/diagnosis, as well as methods for 
screening the proteins/nucleic acid sequences as potential anti-microbial targets.

Streptococcus pneumoniae, commonly referred to as the pneumococcus, is an
important pathogenic organism. The continuing significance of Streptococcus
pneumoniae infections in relation to human disease in developing and developed
countries has been authoritatively reviewed (Siber, G.R., Science, 265: 1385-1387
(1994)). That review indicates that on a global scale this organism is believed to be the
most common bacterial cause of acute respiratory infections, and is estimated to
result in 1 million childhood deaths each year, mostly in developing countries
(Stansfield, S.K., Pediatr. Infect. Dis., 6: 622 (1987)). In the USA it has been
suggested (Breiman et al, Arch. Intern. Med., 150: 1401 (1990)) that the
pneumococcus is still the most common cause of bacterial pneumonia, and that
disease rates are particularly high in young children, in the elderly, and in patients
with predisposing conditions such as asplenia, heart, lung and kidney disease,
diabetes, alcoholism, or with immunosuppressive disorders, especially AIDS. These
groups are at higher risk of pneumococcal septicaemia and hence meningitis and
therefore have a greater risk of dying from pneumococcal infection. The
pneumococcus is also the leading cause of otitis media and sinusitis, which remain
prevalent infections in children in developed countries, and which incur substantial
costs.


The need for effective preventative strategies against pneumococcal infection is
highlighted by the recent emergence of penicillin-resistant pneumococci. It has been
reported that 6.6% of pneumococcal isolates in 13 US hospitals in 12 states were found
to be resistant to penicillin and some isolates were also resistant to other antibiotics
including third generation cephalosporins (Schappert, S.M., Vital and Health Statistics
of the Centres for Disease Control/National Centre for Health Statistics, 214:1
(1992)). The rates of penicillin resistance can be higher (up to 20%) in some
hospitals (Breiman et al, J. Am. Med. Assoc., 271: 1831 (1994)). Since the
development of penicillin resistance among pneumococci is both recent and sudden,
coming after decades during which penicillin remained an effective treatment, these
findings are regarded as alarming.



For the reasons given above, there are compelling grounds for considering
improvements in the means of preventing, controlling, diagnosing or treating
pneumococcal diseases.



Various approaches have been taken in order to provide vaccines for the prevention
of pneumococcal infections. Difficulties arise, for instance, in view of the variety of
serotypes (at least 90) based on the structure of the polysaccharide capsule
surrounding the organism. Vaccines against individual serotypes are not effective
against other serotypes and this means that vaccines must include polysaccharide
antigens from a whole range of serotypes in order to be effective in a majority of
cases. An additional problem arises because it has been found that the capsular
polysaccharides (each of which determines the serotype and is the major protective
antigen) when purified and used as a vaccine do not reliably induce protective
antibody responses in children under two years of age, the age group which suffers
the highest incidence of invasive pneumococcal infection and meningitis.



A modification of the approach using capsule antigens relies on conjugating the
polysaccharide to a protein in order to derive an enhanced immune response,
particularly by giving the response T-cell dependent character. This approach has
been used in the development of a vaccine against Haemophilus influenzae, for
instance. There are, however, issues of cost concerning both the
multi-polysaccharide vaccines and those based on conjugates.

A third approach is to look for other antigenic components which offer the potential 
to be vaccine candidates. This is the basis of the present invention. Using a specially 
developed bacterial expression system, we have been able to identify a group of 
protein antigens from pneumococcus which are associated with the bacterial
envelope or which are secreted. 

Thus, in a first aspect the present invention provides a Streptococcus pneumoniae 
protein or polypeptide having a sequence selected from those shown in Table 1.

In a second aspect, the present invention provides a Streptococcus pneumoniae 
protein or polypeptide having a sequence selected from those shown in Table 2.

A protein or polypeptide of the present invention may be provided in substantially pure 
form. For example, it may be provided in a form which is substantially free of other 
proteins. 

As discussed herein, the proteins and polypeptides of the invention are useful as 
antigenic material. Such material can be "antigenic" and/or "immunogenic". 
Generally, "antigenic" is taken to mean that the protein or polypeptide is capable of 
being used to raise antibodies or indeed is capable of inducing an antibody response in 
a subject. "Immunogenic" is taken to mean that the protein or polypeptide is capable of 
eliciting a protective immune response in a subject. Thus, in the latter case, the protein 
or polypeptide may be capable of not only generating an antibody response but, in 
addition, a non-antibody based immune response.

The skilled person will appreciate that homologues or derivatives of the proteins or
polypeptides of the invention will also find use in the context of the present invention,
i.e. as antigenic/immunogenic material. Thus, for instance, proteins or polypeptides
which include one or more additions, deletions, substitutions or the like are
encompassed by the present invention. In addition, it may be possible to replace one
amino acid with another of similar "type", for instance by replacing one hydrophobic
amino acid with another.

One can use a program such as the CLUSTAL program to compare amino acid
sequences. This program compares amino acid sequences and finds the optimal
alignment by inserting spaces in either sequence as appropriate. It is possible to
calculate amino acid identity or similarity (identity plus conservation of amino acid
type) for an optimal alignment. A program like BLASTx will align the longest stretch
of similar sequences and assign a value to the fit. It is thus possible to obtain a
comparison where several regions of similarity are found, each having a different
score. Both types of identity analysis are contemplated in the present invention.
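By way of illustration only, the distinction between identity and similarity drawn above can be sketched in Python for a pair of sequences that have already been aligned. The residue groupings, the gap symbol and the choice of denominator (all alignment columns, gaps included) are assumptions made for this sketch; programs such as CLUSTAL and BLASTx use substitution matrices and their own conventions.

```python
# Illustrative residue "types"; an assumption for this sketch, not taken from
# any alignment program.
RESIDUE_CLASSES = {
    "hydrophobic": set("AVLIMFWC"),
    "polar": set("STNQYGP"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def residue_class(residue):
    """Return the name of the class containing `residue`, or None."""
    for name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return name
    return None


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (% identity, % similarity) for two pre-aligned sequences.

    Similarity counts identical residues plus residues of the same class.
    Percentages are taken over all alignment columns, including gapped ones.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == gap or b == gap:
            continue  # a gap is neither identical nor similar
        if a == b:
            identical += 1
            similar += 1
        elif residue_class(a) is not None and residue_class(a) == residue_class(b):
            similar += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100 * identical / compared, 100 * similar / compared


if __name__ == "__main__":
    # Hypothetical aligned peptides, used only to exercise the function.
    print(identity_and_similarity("MKV-LA", "MRIDLA"))
```

A different denominator (for example, the length of the shorter sequence) gives different percentages, so any threshold should be read together with the convention used.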

In the case of homologues and derivatives, the degree of identity with a protein or
polypeptide as described herein is less important than that the homologue or derivative
should retain the antigenicity or immunogenicity of the original protein or polypeptide.
However, suitably, homologues or derivatives having at least 60% similarity (as
discussed above) with the proteins or polypeptides described herein are provided.
Preferably, homologues or derivatives having at least 70% similarity, more preferably
at least 80% similarity are provided. Most preferably, homologues or derivatives
having at least 90% or even 95% similarity are provided.

In an alternative approach, the homologues or derivatives could be fusion proteins, 
incorporating moieties which render purification easier, for example by effectively 
tagging the desired protein or polypeptide. It may be necessary to remove the "tag" or
it may be the case that the fusion protein itself retains sufficient antigenicity to be 
useful. 

In an additional aspect of the invention there are provided antigenic/immunogenic
fragments of the proteins or polypeptides of the invention, or of homologues or 
derivatives thereof. 

For fragments of the proteins or polypeptides described herein, or of homologues or
derivatives thereof, the situation is slightly different. It is well known that it is possible to
screen an antigenic protein or polypeptide to identify epitopic regions, i.e. those regions
which are responsible for the protein or polypeptide's antigenicity or immunogenicity.
Methods for carrying out such screening are well known in the art. Thus, the fragments
of the present invention should include one or more such epitopic regions or be
sufficiently similar to such regions to retain their antigenic/immunogenic properties.
Thus, for fragments according to the present invention the degree of identity is perhaps
irrelevant, since they may be 100% identical to a particular part of a protein or
polypeptide, homologue or derivative as described herein. The key issue, once again, is
that the fragment retains the antigenic/immunogenic properties.


Thus, what is important for homologues, derivatives and fragments is that they possess 
at least a degree of the antigenicity/immunogenicity of the protein or polypeptide from 
which they are derived. 

Gene cloning techniques may be used to provide a protein of the invention in
substantially pure form. These techniques are disclosed, for example, in J. Sambrook
et al, Molecular Cloning, 2nd Edition, Cold Spring Harbor Laboratory Press (1989).
Thus, in a third aspect, the present invention provides a nucleic acid molecule
comprising or consisting of a sequence which is:

(i) any of the DNA sequences set out in Table 1 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which has substantial identity with any of those of (i), (ii)
and (iii); or

(v) a sequence which codes for a homologue, derivative or fragment of a
protein as defined in Table 1.

In a fourth aspect the present invention provides a nucleic acid molecule comprising or 
consisting of a sequence which is: 

(i) any of the DNA sequences set out in Table 2 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which has substantial identity with any of those of (i), (ii) 
and (iii); or

(v) a sequence which codes for a homologue, derivative or fragment of a
protein as defined in Table 2.

The nucleic acid molecules of the invention may include a plurality of such sequences, 
and/or fragments. The skilled person will appreciate that the present invention can 
include novel variants of those particular novel nucleic acid molecules which are 
exemplified herein. Such variants are encompassed by the present invention. These 
may occur in nature, for example because of strain variation. For example, additions, 
substitutions and/or deletions are included. In addition, and particularly when utilising 
microbial expression systems, one may wish to engineer the nucleic acid sequence by 
making use of known preferred codon usage in the particular organism being used for 
expression. Thus, synthetic or non-naturally occurring variants are also included within 
the scope of the invention. 
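As a hedged illustration of engineering a sequence for preferred codon usage, the following Python sketch swaps each codon for a "preferred" synonymous codon and confirms that the encoded polypeptide is unchanged. The `PREFERRED` table and the input sequence are hypothetical placeholders, not measured codon preferences for any organism; only the standard genetic code table is factual.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical preferred codons for an expression host (placeholder values).
PREFERRED = {"M": "ATG", "K": "AAA", "L": "TTA", "S": "TCT", "E": "GAA", "*": "TAA"}


def split_codons(dna):
    """Split a coding sequence into codons, insisting on a whole number of them."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(dna))


def recode(dna, preferred):
    """Replace each codon with the preferred synonymous codon, where one is given."""
    recoded = []
    for codon in split_codons(dna):
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        recoded.append(new_codon)
    return "".join(recoded)


if __name__ == "__main__":
    original = "ATGAAGCTGAGCGAGTGA"  # hypothetical coding sequence
    engineered = recode(original, PREFERRED)
    # The protein sequence must be identical before and after recoding.
    assert translate(engineered) == translate(original)
    print(original, "->", engineered)
```

The check inside `recode` guards against a mis-entered preference table silently changing the protein.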

The term "RNA equivalent" when used above indicates that a given RNA molecule has 
a sequence which is complementary to that of a given DNA molecule (allowing for the 
fact that in RNA "U" replaces "T" in the genetic code). 
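The definition above can be illustrated with a short Python sketch. The first helper follows the wording of the definition (an RNA strand complementary to the DNA strand, with U in place of T); the second shows the same-sense reading (T simply replaced by U) for comparison. Both function names are hypothetical, and writing the complementary strand 5' to 3' is an assumed convention.

```python
# Base pairing between a DNA strand and the RNA strand complementary to it.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def complementary_rna(dna):
    """Return the RNA strand complementary to `dna`, written 5' to 3'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def same_sense_rna(dna):
    """Return the RNA with the same sequence as `dna`, with U replacing T."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    dna = "ATGC"  # hypothetical four-base example
    print(complementary_rna(dna))
    print(same_sense_rna(dna))
```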

When comparing nucleic acid sequences for the purposes of determining the degree of
homology or identity one can use programs such as BESTFIT and GAP (both from the
Wisconsin Genetics Computer Group (GCG) software package). BESTFIT, for
example, compares two sequences and produces an optimal alignment of the most
similar segments. GAP enables sequences to be aligned along their whole length and
finds the optimal alignment by inserting spaces in either sequence as appropriate.
Suitably, in the context of the present invention when discussing identity of nucleic acid
sequences, the comparison is made by alignment of the sequences along their whole
length.

Preferably, sequences which have substantial identity have at least 50% sequence
identity, desirably at least 75% sequence identity and more desirably at least 90% or at
least 95% sequence identity with said sequences. In some cases the sequence identity
may be 99% or above.
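A whole-length comparison of the kind described above can be sketched with a minimal Needleman-Wunsch global alignment in Python. This is an illustration under stated assumptions, not the GCG implementation: the match, mismatch and gap scores are arbitrary values chosen for the sketch, gaps are penalised linearly, and percent identity is taken over the full alignment length.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align `a` and `b`; return the two gapped strings.

    Scoring values are assumptions for this sketch (linear gap penalty).
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent identity over the whole length of a global alignment."""
    if not a and not b:
        raise ValueError("both sequences are empty")
    aligned_a, aligned_b = global_align(a.upper(), b.upper())
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b) if x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical short sequences, used only to exercise the functions.
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

Because the result depends on the scoring scheme and on the denominator, a stated identity threshold is only meaningful alongside the program and parameters used to compute it.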

Desirably, the term "substantial identity" indicates that said sequence has a greater 
degree of identity with any of the sequences described herein than with prior art nucleic 
acid sequences. 

It should however be noted that where a nucleic acid sequence of the present invention 
codes for at least part of a novel gene product the present invention includes within its 
scope all possible sequences coding for the gene product or for a novel part thereof.

The nucleic acid molecule may be in isolated or recombinant form. It may be 
incorporated into a vector and the vector may be incorporated into a host. Such vectors 
and suitable hosts form yet further aspects of the present invention.

Therefore, for example, by using probes based upon the nucleic acid sequences 
provided herein, genes in Streptococcus pneumoniae can be identified. They can then 
be excised using restriction enzymes and cloned into a vector. The vector can be 
introduced into a suitable host for expression. 

Nucleic acid molecules of the present invention may be obtained from S. pneumoniae by
the use of appropriate probes complementary to part of the sequences of the nucleic
acid molecules. Restriction enzymes or sonication techniques can be used to obtain
appropriately sized fragments for probing.

Alternatively, PCR techniques may be used to amplify a desired nucleic acid sequence.
Thus the sequence data provided herein can be used to design two primers for use in
PCR so that a desired sequence, including whole genes or fragments thereof, can be
targeted and then amplified to a high degree.


Typically primers will be at least 15-25 nucleotides long. 
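The design of two primers flanking a desired sequence can be sketched in Python as follows. This is a minimal illustration: the template, the coordinates and the function names are hypothetical, and the melting temperature uses the rough Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a first approximation for short oligonucleotides. Practical primer design also weighs secondary structure, primer-dimer formation and specificity, none of which is modelled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(template, start, end, length=20):
    """Return (forward, reverse) primers flanking template[start:end].

    The forward primer copies the first `length` bases of the target; the
    reverse primer is the reverse complement of the last `length` bases.
    """
    target = template[start:end].upper()
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse


def wallace_tm(primer):
    """Rough melting temperature in degrees C by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Hypothetical template; a short primer length keeps the example readable.
    template = "AAATGGCCGATTCTGCAGGTTAAGG"
    forward, reverse = design_primers(template, 3, 23, length=6)
    print(forward, reverse, wallace_tm(forward), wallace_tm(reverse))
```

With real sequence data the default length of 20 falls within the 15-25 nucleotide range mentioned above.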

As a further alternative, chemical synthesis may be used. This may be automated.
Relatively short sequences may be chemically synthesised and ligated together to
provide a longer sequence.

There is another group of proteins from S. pneumoniae which have been identified 
using the bacterial expression system described herein. These are known proteins 
from S. pneumoniae, which have not previously been identified as antigenic proteins. 

The amino acid sequences of this group of proteins, together with DNA sequences
coding for them, are shown in Table 3. These proteins, or homologues, derivatives
and/or fragments thereof also find use as antigens/immunogens. Thus, in another
aspect the present invention provides the use of a protein or polypeptide having a
sequence selected from those shown in Tables 1-3, or homologues, derivatives
and/or fragments thereof, as an immunogen/antigen.

In yet a further aspect the present invention provides an immunogenic/antigenic
composition comprising one or more proteins or polypeptides selected from those
whose sequences are shown in Tables 1-3, or homologues or derivatives thereof,
and/or fragments of any of these. In preferred embodiments, the
immunogenic/antigenic composition is a vaccine or is for use in a diagnostic assay.

In the case of vaccines, suitable additional excipients, diluents, adjuvants or the like
may be included. Numerous examples of these are well known in the art.

It is also possible to utilise the nucleic acid sequences shown in Tables 1-3 in the
preparation of so-called DNA vaccines. Thus, the invention also provides a vaccine
composition comprising one or more nucleic acid sequences as defined herein. DNA
vaccines are described in the art (see for instance, Donnelly et al, Ann. Rev.
Immunol., 15:617-648 (1997)) and the skilled person can use such art-described
techniques to produce and use DNA vaccines according to the present invention.

As already discussed herein, the proteins or polypeptides described herein, their
homologues or derivatives, and/or fragments of any of these, can be used in methods
of detecting/diagnosing S. pneumoniae. Such methods can be based on the detection
of antibodies against such proteins which may be present in a subject. Therefore the
present invention provides a method for the detection/diagnosis of S. pneumoniae
which comprises the step of bringing into contact a sample to be tested with at least
one protein, or homologue, derivative or fragment thereof, as described herein.
Suitably, the sample is a biological sample, such as a tissue sample or a sample of
blood or saliva obtained from a subject to be tested.

In an alternative approach, the proteins described herein, or homologues, derivatives
and/or fragments thereof, can be used to raise antibodies, which in turn can be used
to detect the antigens, and hence S. pneumoniae. Such antibodies form another aspect
of the invention. Antibodies within the scope of the present invention may be
monoclonal or polyclonal.


Polyclonal antibodies can be raised by stimulating their production in a suitable animal
host (e.g. a mouse, rat, guinea pig, rabbit, sheep, goat or monkey) when a protein as
described herein, or a homologue, derivative or fragment thereof, is injected into the
animal. If desired, an adjuvant may be administered together with the protein.
Well-known adjuvants include Freund's adjuvant (complete and incomplete) and aluminium
hydroxide. The antibodies can then be purified by virtue of their binding to a protein as
described herein.


Monoclonal antibodies can be produced from hybridomas. These can be formed by 
fusing myeloma cells and spleen cells which produce the desired antibody in order to 
form an immortal cell line. Thus the well-known Kohler & Milstein technique (Nature
256 (1975)) or subsequent variations upon this technique can be used.


Techniques for producing monoclonal and polyclonal antibodies that bind to a 
particular polypeptide/protein are now well developed in the art. They are discussed in 
standard immunology textbooks, for example in Roitt et al, Immunology, second edition
(1989), Churchill Livingstone, London. 


In addition to whole antibodies, the present invention includes derivatives thereof which
are capable of binding to proteins etc. as described herein. Thus the present invention
includes antibody fragments and synthetic constructs. Examples of antibody fragments
and synthetic constructs are given by Dougall et al in Tibtech 12 372-379 (September
1994).

Antibody fragments include, for example, Fab, F(ab')2 and Fv fragments (these are
discussed in Roitt et al [supra]). Fv fragments can be modified
to produce a synthetic construct known as a single chain Fv (scFv) molecule. This
includes a peptide linker covalently joining VH and VL regions, which contributes to the
stability of the molecule. Other synthetic constructs that can be used include CDR
peptides. These are synthetic peptides comprising antigen-binding determinants.
Peptide mimetics may also be used. These molecules are usually conformationally
restricted organic rings that mimic the structure of a CDR loop and that include
antigen-interactive side chains.

Synthetic constructs include chimaeric molecules. Thus, for example, humanised (or 
primatised) antibodies or derivatives thereof are within the scope of the present 
invention. An example of a humanised antibody is an antibody having human 
framework regions, but rodent hypervariable regions. Ways of producing chimaeric 
antibodies are discussed for example by Morrison et al in PNAS, 81, 6851-6855 (1984) 
and by Takeda et al in Nature, 314, 452-454 (1985).

Synthetic constructs also include molecules comprising an additional moiety that 
provides the molecule with some desirable property in addition to antigen binding. For 
example the moiety may be a label (e.g. a fluorescent or radioactive label). 
Alternatively, it may be a pharmaceutically active agent.

Antibodies, or derivatives thereof, find use in detection/diagnosis of S. pneumoniae.
Thus, in another aspect the present invention provides a method for the
detection/diagnosis of S. pneumoniae which comprises the step of bringing into contact
a sample to be tested and antibodies capable of binding to one or more proteins 
described herein, or to homologues, derivatives and/or fragments thereof. 

In addition, so-called "Affibodies" may be utilised. These are binding proteins 
selected from combinatorial libraries of an alpha-helical bacterial receptor domain 
(Nord et al, ). Thus, small protein domains capable of specific binding to different
target proteins can be selected using combinatorial approaches.

It will also be clear that the nucleic acid sequences described herein may be used to
detect/diagnose S. pneumoniae. Thus, in yet a further aspect, the present invention
provides a method for the detection/diagnosis of S. pneumoniae which comprises the
step of bringing into contact a sample to be tested with at least one nucleic acid
sequence as described herein. Suitably, the sample is a biological sample, such as a
tissue sample or a sample of blood or saliva obtained from a subject to be tested.
Such samples may be pre-treated before being used in the methods of the invention.
Thus, for example, a sample may be treated to extract DNA. Then, DNA probes
based on the nucleic acid sequences described herein (i.e. usually fragments of such
sequences) may be used to detect nucleic acid from S. pneumoniae.

In additional aspects, the present invention provides: 

(a) a method of vaccinating a subject against S. pneumoniae which comprises the
step of administering to a subject a protein or polypeptide of the invention, or a
derivative, homologue or fragment thereof, or an immunogenic composition of the
invention;

(b) a method of vaccinating a subject against S. pneumoniae which comprises the
step of administering to a subject a nucleic acid molecule as defined herein;

(c) a method for the prophylaxis or treatment of S. pneumoniae infection which
comprises the step of administering to a subject a protein or polypeptide of the
invention, or a derivative, homologue or fragment thereof, or an immunogenic
composition of the invention;

(d) a method for the prophylaxis or treatment of S. pneumoniae infection which
comprises the step of administering to a subject a nucleic acid molecule as defined
herein;

(e) a kit for use in detecting/diagnosing S. pneumoniae infection comprising one
or more proteins or polypeptides of the invention, or homologues, derivatives or
fragments thereof, or an antigenic composition of the invention; and

(f) a kit for use in detecting/diagnosing S. pneumoniae infection comprising one
or more nucleic acid molecules as defined herein.

Given that we have identified a group of important proteins, such proteins are 
potential targets for anti-microbial therapy. It is necessary, however, to determine 
whether each individual protein is essential for the organism's viability. Thus, the
present invention also provides a method of determining whether a protein or 
polypeptide as described herein represents a potential anti-microbial target which 
comprises antagonising, inhibiting or otherwise interfering with the function or 
expression of said protein and determining whether S. pneumoniae is still viable. 


A suitable method for inactivating the protein is to effect selected gene knockouts, i.e.
prevent expression of the protein and determine whether this results in a lethal
change. Suitable methods for carrying out such gene knockouts are described in Li
et al, P.N.A.S., 94:13251-13256 (1997) and Kolkman et al, 178:3736-3741 (1996).

In a final aspect the present invention provides the use of an agent capable of 
antagonising, inhibiting or otherwise interfering with the function or expression of a 
protein or polypeptide of the invention in the manufacture of a medicament for use in 
the treatment or prophylaxis of S. pneumoniae infection.

As mentioned above, we have used a bacterial expression system as a means of 
identifying those proteins which are surface associated, secreted or exported and 
thus would find use as antigens.



The information necessary for the secretion/export of proteins has been extensively
studied in bacteria. In the majority of cases, protein export requires a signal peptide
to be present at the N-terminus of the precursor protein so that it becomes directed to
the translocation machinery on the cytoplasmic membrane. During or after
translocation, the signal peptide is removed by a membrane associated signal
peptidase. Ultimately the localization of the protein (i.e. whether it be secreted, an
integral membrane protein or attached to the cell wall) is determined by sequences
other than the leader peptide itself.

We are specifically interested in surface located or exported proteins as these are
likely to be antigens for use in vaccines, as diagnostic reagents or as targets for
therapy with novel chemical entities. We have therefore developed a screening
vector-system in Lactococcus lactis that permits genes encoding exported proteins to
be identified and isolated. We provide below a representative example showing how
given novel surface associated proteins from Streptococcus pneumoniae have been
identified and characterized. The screening vector incorporates the staphylococcal
nuclease gene nuc lacking its own export signal as a secretion reporter.

Staphylococcal nuclease is a naturally secreted, heat-stable, monomeric enzyme
which has been efficiently expressed and secreted in a range of Gram-positive
bacteria (Shortle, Gene, 22:181-189 (1983); Kovacevic et al., J. Bacteriol.,
162:521-528 (1985); Miller et al., J. Bacteriol., 169:3508-3514 (1987); Liebl et al.,
J. Bacteriol., 174:1854-1861 (1992); Le Loir et al., J. Bacteriol., 176:5135-5139
(1994); Poquet et al., J. Bacteriol., 180:1904-1912 (1998)).

Recently, Poquet et al. ((1998), supra) have described a screening vector
incorporating the nuc gene lacking its own signal leader as a reporter to identify
exported proteins in Gram-positive bacteria, and have applied it to L. lactis. This
vector (pFUN) contains the pAMβ1 replicon, which functions in a broad host range
of Gram-positive bacteria, in addition to the ColE1 replicon, which promotes replication
in Escherichia coli and certain other Gram-negative bacteria. Unique cloning sites
present in the vector can be used to generate transcriptional and translational fusions
between cloned genomic DNA fragments and the open reading frame of the
truncated nuc gene devoid of its own signal secretion leader. The nuc gene makes an
ideal reporter gene because the secretion of nuclease can readily be detected using a
simple and sensitive plate test: recombinant colonies secreting the nuclease develop
a pink halo whereas control colonies remain white (Shortle, (1983), supra; Le Loir
et al., (1994), supra).
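To illustrate why only some cloned fragments give a productive translational fusion to the truncated reporter, the following Python sketch checks whether a fragment, read from its own start codon up to the fusion junction, keeps the downstream reporter in frame and contains no stop codon. The function name and the test sequences are hypothetical, and the check is a simplification: in the screen itself, export-competent fusions are recognised by the plate test, not by sequence inspection.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def keeps_reporter_in_frame(fragment):
    """Return True if `fragment` can be read through into a downstream reporter.

    `fragment` is assumed to run from the start codon of the cloned gene to the
    junction with the reporter open reading frame.
    """
    fragment = fragment.upper()
    if not fragment.startswith("ATG"):
        return False  # this sketch only considers ATG-initiated reading frames
    if len(fragment) % 3 != 0:
        return False  # the reporter would be read out of frame
    codons = [fragment[i:i + 3] for i in range(0, len(fragment), 3)]
    # Any stop codon before the junction would terminate translation early.
    return not any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    # Hypothetical fragments, used only to exercise the check.
    for candidate in ("ATGAAAAAACTG", "ATGAAAAAACT", "ATGTAAAAACTG"):
        print(candidate, keeps_reporter_in_frame(candidate))
```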

Thus, the invention will now be described with reference to the following 
representative example, which provides details of how the proteins, polypeptides and 
nucleic acid sequences described herein were identified as antigenic targets.

We describe herein the construction of three reporter vectors and their use in L. 
lactis to identify and isolate genomic DNA fragments from Streptococcus 
pneumoniae encoding secreted or surface associated proteins. 

The invention will now be described with reference to the examples, which should
not be construed as in any way limiting the invention. The examples refer to the
figures in which:

Figure 1: shows the results of a number of DNA vaccine trials; and
Figure 2: shows the results of further DNA vaccine trials.

EXAMPLE 1 

(i) Construction of the pTREP1-nuc series of reporter vectors

(a) Construction of expression plasmid pTREP1

The pTREP1 plasmid is a high-copy number (40-80 per cell) theta-replicating
Gram-positive plasmid, which is a derivative of the pTREX plasmid, which is itself a
derivative of the previously published pIL253 plasmid. pIL253 incorporates the
broad Gram-positive host range replicon of pAMβ1 (Simon and Chopin, Biochimie,
70:559-567 (1988)) and is non-mobilisable by the L. lactis sex-factor. pIL253 also
lacks the tra function which is necessary for transfer or efficient mobilisation by
conjugative parent plasmids exemplified by pIL501. The Enterococcal pAMβ1
replicon has previously been transferred to various species including Streptococcus,
Lactobacillus and Bacillus species as well as Clostridium acetobutylicum (Oultram
and Klaenhammer, FEMS Microbiological Letters, 27:129-134 (1985); Gibson et
al., (1979); LeBlanc et al., Proceedings of the National Academy of Science USA,
75:3484-3487 (1978)), indicating the potential broad host range utility. The pTREP1
plasmid represents a constitutive transcription vector.

The pTREX vector was constructed as follows. An artificial DNA fragment
containing a putative RNA stabilising sequence, a translation initiation region (TIR),
a multiple cloning site for insertion of the target genes and a transcription terminator
was created by annealing 2 complementary oligonucleotides and extending with Tfl
DNA polymerase. The sense and anti-sense oligonucleotides contained the
recognition sites for NheI and BamHI at their 5' ends respectively to facilitate
cloning. This fragment was cloned between the XbaI and BamHI sites in
pUC19NT7, a derivative of pUC19 which contains the T7 expression cassette from
pLET1 (Wells et al, J. Appl. Bacteriol., 74:629-636 (1993)) cloned between the
EcoRI and HindIII sites. The resulting construct was designated pUCLEX. The
complete expression cassette of pUCLEX was then removed by cutting with HindIII
and blunting followed by cutting with EcoRI before cloning into EcoRI and SacI
(blunted) sites of pIL253 to generate the vector pTREX (Wells and Schofield, In
Current advances in metabolism, genetics and applications-NATO ASI Series, H
98:37-62 (1996)). The putative RNA stabilising sequence and TIR are derived from
the Escherichia coli T7 bacteriophage sequence and modified at one nucleotide
position to enhance the complementarity of the Shine Dalgarno (SD) motif to the
ribosomal 16S RNA of Lactococcus lactis (Schofield et al. pers. comm. University of
Cambridge Dept. Pathology.).

A Lactococcus lactis MG1363 chromosomal DNA fragment exhibiting promoter
activity, which was subsequently designated P7, was cloned between the EcoRI and
BglII sites present in the expression cassette, creating pTREX7. This active
promoter region had been previously isolated using the promoter probe vector
pSB292 (Waterfield et al, Gene, 165:9-15 (1995)). The promoter fragment was
amplified by PCR using the Vent DNA polymerase according to the manufacturer's
instructions.

The pTREP1 vector was then constructed as follows. An artificial DNA fragment 
which included a transcription terminator, the forward pUC sequencing primer, a 
promoter multiple-cloning site region and a universal translation stop sequence was 
created by annealing two overlapping, partially complementary synthetic 
oligonucleotides together and extending with Sequenase according to the 
manufacturer's instructions. The sense and anti-sense (pTREPF and pTREPR) 
oligonucleotides contained the recognition sites for EcoRV and BamHI at their 5' 
ends respectively to facilitate cloning into pTREX7. The transcription terminator 
was that of the Bacillus penicillinase gene, which has been shown to be effective in 
Lactococcus (Jos et al., Applied and Environmental Microbiology, 50:540-542 
(1985)). This was considered necessary as expression of target genes in the pTREX 
vectors was observed to be leaky, which is thought to be the result of cryptic 
promoter activity in the origin region (Schofield et al., pers. comm., University of 
Cambridge, Dept. Pathology). The forward pUC sequencing primer was included to 
enable direct sequencing of cloned DNA fragments. The translation stop sequence, 
which encodes a stop codon in three different frames, was included to prevent 
translational fusions between vector genes and cloned DNA fragments. The pTREX7 
vector was first digested with EcoRI and blunted using the 5'-3' polymerase activity 
of T4 DNA polymerase (NEB) according to the manufacturer's instructions. The 
EcoRI digested and blunt-ended pTREX7 vector was then digested with BglII, thus 
removing the P7 promoter. The artificial DNA fragment derived from the annealed 
synthetic oligonucleotides was then digested with EcoRV and BamHI and cloned into 
the EcoRI(blunted)-BglII digested pTREX7 vector to generate pTREP. A 
Lactococcus lactis MG1363 chromosomal promoter designated P1 was then cloned 
between the EcoRI and BglII sites present in the pTREP expression cassette, forming 
pTREP1. This promoter was also isolated using the promoter probe vector pSB292 
and characterised by Waterfield et al., (1995), supra. The P1 promoter fragment was 
originally amplified by PCR using Vent DNA polymerase according to the 
manufacturer's instructions and cloned into pTREX as an EcoRI-BglII DNA 
fragment. The EcoRI-BglII P1 promoter-containing fragment was removed from 
pTREX1 by restriction enzyme digestion and used for cloning into pTREP 
(Schofield et al., pers. comm., University of Cambridge, Dept. Pathology). 
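
The cloning steps above depend on which restriction ends can be ligated to one another, for example a BamHI-cut fragment into a BglII-cut vector, or a blunted end to an EcoRV end. The following Python sketch illustrates that compatibility check. The recognition sequences and cut positions are the standard ones for these enzymes; the helper names `overhang` and `compatible` are illustrative assumptions and are not part of the described method.

```python
# Recognition site and top-strand cut offset (the enzyme cuts before this index).
# Only palindromic enzymes that leave 5' overhangs or blunt ends are modelled here.
SITES = {
    "BglII": ("AGATCT", 1),
    "BamHI": ("GGATCC", 1),
    "Sau3AI": ("GATC", 0),
    "EcoRI": ("GAATTC", 1),
    "EcoRV": ("GATATC", 3),
    "SmaI": ("CCCGGG", 3),
}


def overhang(enzyme: str) -> str:
    """Return the single-stranded 5' overhang left by the enzyme ('' means blunt)."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Two ends can be ligated if their overhangs are identical (or both blunt)."""
    return overhang(enzyme_a) == overhang(enzyme_b)


if __name__ == "__main__":
    for pair in [("BglII", "BamHI"), ("BglII", "Sau3AI"),
                 ("EcoRV", "SmaI"), ("EcoRI", "BglII")]:
        print(pair, "compatible" if compatible(*pair) else "not compatible")
```

BglII, BamHI and Sau3AI all leave a GATC overhang, which is why fragments cut with one can be joined to vector cut with another.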

(b) PCR amplification of the S. aureus nuc gene 

The nucleotide sequence of the S. aureus nuc gene (EMBL database accession 
number V01281) was used to design synthetic oligonucleotide primers for PCR 
amplification. The primers were designed to amplify the mature form of the nuc 
gene designated nucA, which is generated by proteolytic cleavage of the N-terminal 
19 to 21 amino acids of the secreted propeptide designated SNase B (Shortle, (1983), 
supra). Three sense primers (nucS1, nucS2 and nucS3, Appendix 1) were designed, 
each one having a blunt-ended restriction endonuclease cleavage site for EcoRV or 
SmaI in a different reading frame with respect to the nuc gene. Additionally, BglII 
and BamHI sites were incorporated at the 5' ends of the sense and anti-sense primers 
respectively to facilitate cloning into BamHI and BglII cut pTREP1. The sequences 
of all the primers are given in Appendix 1. Three nuc gene DNA fragments 
encoding the mature form of the nuclease gene (NucA) were amplified by PCR using 
each of the sense primers combined with the anti-sense primer described above. The 
nuc gene fragments were amplified by PCR using S. aureus genomic DNA template, 
Vent DNA Polymerase (NEB) and the conditions recommended by the 
manufacturer. An initial denaturation step at 93 °C for 2 min was followed by 30 
cycles of denaturation at 93 °C for 45 sec, annealing at 50 °C for 45 sec, and 
extension at 73 °C for 1 min, and then a final 5 min extension step at 73 °C. The 
PCR amplified products were purified using a Wizard clean up column (Promega) to 
remove unincorporated nucleotides and primers. 
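
To illustrate how the three sense primers place the blunt cloning site in a different reading frame relative to nuc, the following Python sketch counts the bases between the blunt cut and the start of the nuc-specific portion of each primer. The primer sequences are those listed in Appendix 1. The cut positions (GAT^ATC for EcoRV, CCC^GGG for SmaI) are the standard ones; the choice of `tcacaaacag` as the start of the nuc-specific region is an assumption based on the sequence shared by all three primers.

```python
# Sense primers as listed in Appendix 1.
PRIMERS = {
    "nucS1": "cgagatctgatatctcacaaacagataacggcgtaaatag",
    "nucS2": "gaagatcttccccgggatcacaaacagataacggcgtaaatag",
    "nucS3": "cgagatctgatatccatcacaaacagataacggcgtaaatag",
}

# Blunt cutters: recognition site and number of bases left of the cut.
BLUNT_SITES = {"EcoRV": ("gatatc", 3), "SmaI": ("cccggg", 3)}

# Assumed start of the nuc-specific sequence common to all three primers.
NUC_START = "tcacaaacag"


def frame_after_blunt_cut(primer: str) -> tuple[str, int]:
    """Return (enzyme, offset mod 3) from the blunt cut to the nuc start."""
    for enzyme, (site, left) in BLUNT_SITES.items():
        pos = primer.find(site)
        if pos == -1:
            continue
        cut = pos + left
        nuc = primer.find(NUC_START, cut)
        if nuc == -1:
            raise ValueError("nuc-specific region not found after the cut site")
        return enzyme, (nuc - cut) % 3
    raise ValueError("no blunt site found in primer")


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        enzyme, frame = frame_after_blunt_cut(seq)
        print(f"{name}: {enzyme} site, nuc begins {frame} base(s) out of frame")
```

Because the three offsets differ modulo 3, an insert ligated at the blunt site can be in frame with nuc in one of the three vectors regardless of where its own reading frame ends.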

(c) Construction of the pTREP1-nuc vectors 

The purified nuc gene fragments described in section (b) were digested with BglII and 
BamHI using standard conditions and ligated to BamHI and BglII cut and 
dephosphorylated pTREP1 to generate the pTREP1-nuc1, pTREP1-nuc2 and 
pTREP1-nuc3 series of reporter vectors. General molecular biology techniques were 
carried out using the reagents and buffers supplied by the manufacturer or using 
standard conditions (Sambrook and Maniatis, (1989), supra). In each of the pTREP1- 
nuc vectors the expression cassette comprises a transcription terminator, lactococcal 
promoter P1, unique cloning sites (BglII, EcoRV or SmaI) followed by the mature 
form of the nuc gene and a second transcription terminator. Note that the sequences 
required for translation and secretion of the nuc gene were deliberately excluded in 
this construction. Such elements can only be provided by appropriately digested 
foreign DNA fragments (representing the target bacterium) which can be cloned into 
the unique restriction sites present immediately upstream of the nuc gene. 

In possessing a promoter, the pTREP1-nuc vectors differ from the pFUN vector 
described by Poquet et al. (1998), supra, which was used to identify L. lactis 
exported proteins by screening for Nuc activity directly in L. lactis. As the 
pFUN vector does not contain a promoter upstream of the nuc open reading frame, 
the cloned genomic DNA fragment must also provide the signals for transcription in 
addition to those elements required for translation initiation and secretion of Nuc. 
This limitation may prevent the isolation of genes that are distant from a promoter, 
for example genes which are within polycistronic operons. Additionally, there can be 
no guarantee that promoters derived from other species of bacteria will be 
recognised and functional in L. lactis. Certain promoters may be under stringent 
regulation in the natural host but not in L. lactis. In contrast, the presence of the P1 
promoter in the pTREP1-nuc series of vectors ensures that promoterless DNA 
fragments (or DNA fragments containing promoter sequences not active in L. lactis) 
will still be transcribed. 

(d) Screening for secreted proteins in S. pneumoniae 

Genomic DNA isolated from S. pneumoniae was digested with the restriction 
enzyme Tru9I. This enzyme, which recognises the sequence 5'-TTAA-3', was used 
because it cuts A/T-rich genomes efficiently and can generate random genomic 
DNA fragments within the preferred size range (usually averaging 0.5 - 1.0 kb). 
This size range was preferred because there is an increased probability that the P1 
promoter can be utilised to transcribe a novel gene sequence. However, the P1 
promoter may not be necessary in all cases as it is possible that many streptococcal 
promoters are recognised in L. lactis. DNA fragments of different size ranges were 
purified from partial Tru9I digests of S. pneumoniae genomic DNA. As the Tru9I 
restriction enzyme generates staggered ends, the DNA fragments had to be made 
blunt-ended before ligation to the EcoRV or SmaI cut pTREP1-nuc vectors. This 
was achieved by a partial fill-in reaction using the 5'-3' polymerase 
activity of Klenow enzyme. Briefly, Tru9I digested DNA was dissolved in a solution 
(usually between 10-20 µl in total) supplemented with T4 DNA ligase buffer (New 
England Biolabs; NEB) (1X) and 33 µM of each of the required dNTPs, in this case 
dATP and dTTP. Klenow enzyme was added (1 unit Klenow enzyme (NEB) per µg 
of DNA) and the reaction incubated at 25°C for 15 minutes. The reaction was 
stopped by incubating the mix at 75°C for 20 minutes. EcoRV or SmaI digested 
pTREP-nuc plasmid DNA was then added (usually between 200-400 ng). The mix 
was then supplemented with 400 units of T4 DNA ligase (NEB) and T4 DNA ligase 
buffer (1X) and incubated overnight at 16°C. The ligation mix was precipitated 
directly in 100% ethanol and 1/10 volume of 3M sodium acetate (pH 5.2) and used 
to transform L. lactis MG1363 (Gasson, 1983). Alternatively, the gene cloning site 
of the pTREP-nuc vectors also contains a BglII site which can be used to clone, for 
example, Sau3AI digested genomic DNA fragments. 
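
The following Python sketch illustrates why partial rather than complete Tru9I digests were needed to reach the preferred fragment size. It performs an in-silico complete digest at TTAA (Tru9I cuts T^TAA) and estimates the average spacing between sites in a random sequence of a given base composition. The GC fraction of 0.40 used in the example is an assumed round figure for an A/T-rich genome, not a value taken from this document, and the function names are illustrative.

```python
import re


def tru9i_fragments(seq: str) -> list[str]:
    """Complete in-silico digest: cut after the first T of every TTAA site."""
    seq = seq.upper()
    # Lookahead so that every occurrence is reported, including adjacent sites.
    cuts = [m.start() + 1 for m in re.finditer(r"(?=TTAA)", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def expected_site_spacing(gc_fraction: float) -> float:
    """Mean distance (bp) between TTAA sites in a random sequence.

    Assumes independent bases with A and T each at (1 - GC) / 2.
    """
    p_a_or_t = (1.0 - gc_fraction) / 2.0
    return 1.0 / p_a_or_t ** 4


if __name__ == "__main__":
    print(tru9i_fragments("GGTTAACCTTAAGG"))
    print(f"{expected_site_spacing(0.40):.0f} bp between sites at 40% GC")
```

Under these assumptions a complete digest would give fragments of roughly a hundred base pairs on average, well below 0.5 - 1.0 kb, which is consistent with size-selecting fragments from partial digests.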

L. lactis transformant colonies were grown on brain heart infusion agar and nuclease- 
secreting (Nuc+) clones were detected by a toluidine blue-DNA-agar overlay (0.05 
M Tris pH 9.0, 10 g of agar per litre, 10 g of NaCl per litre, 0.1 mM CaCl2, 0.03% 
wt/vol salmon sperm DNA and 90 mg of Toluidine blue O dye), essentially as 
described by Shortle, 1983, supra and Le Loir et al., 1994, supra. The plates were 
then incubated at 37°C for up to 2 hours. Nuclease-secreting clones develop an 
easily identifiable pink halo. Plasmid DNA was isolated from Nuc+ recombinant L. 
lactis clones and DNA inserts were sequenced on one strand using the NucSeq 
sequencing primer described in Appendix 1, which sequences directly through the 
DNA insert. 

Isolation of Genes Encoding Exported Proteins from S. pneumoniae 

A large number of gene sequences putatively encoding exported proteins in S. 
pneumoniae have been identified using the nuclease screening system. These have 
now been further analysed to remove artefacts. The sequences identified using the 
screening system have been analysed using a number of parameters. 

1. All putative surface proteins were analysed for leader/signal peptide 
sequences using the software programs Sequencher (Gene Codes Corporation) and 
DNA Strider (Marck, Nucleic Acids Res., 16:1829-1836 (1988)). Bacterial signal 
peptide sequences share a common design. They are characterised by a short 
positively charged N-terminus (n-region) immediately preceding a stretch of 
hydrophobic residues (central portion, h-region) followed by a more polar C-terminal 
portion which contains the cleavage site (c-region). Computer software is available 
which allows hydropathy profiling of putative proteins and which can readily 
identify the very distinctive hydrophobic portion (h-region) typical of leader peptide 
sequences. In addition, the sequences were checked for the presence or absence of 
a potential ribosomal binding site (Shine-Dalgarno motif) required for translation 
initiation of the putative nuc reporter fusion protein. 
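
As an illustration of the hydropathy profiling mentioned in point 1, the following Python sketch computes a sliding-window mean of Kyte-Doolittle hydropathy values and reports the most hydrophobic window, which for a typical signal peptide falls in the h-region. This is a generic sketch and not the algorithm of the software packages named above; the window length of 7 and the example sequence are hypothetical choices made for illustration.

```python
# Kyte-Doolittle hydropathy scale (higher values are more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list[tuple[int, float]]:
    """Return (start index, mean hydropathy) for every full window."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        (i, sum(scores[i:i + window]) / window)
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophobic_window(protein: str, window: int = 7) -> tuple[int, float]:
    """Return the (start index, mean hydropathy) of the highest-scoring window."""
    return max(hydropathy_profile(protein, window), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical N-terminus: charged n-region, hydrophobic h-region, polar tail.
    example = "MKKKLLALLLVAAGLLSACSQEDKTDSE"
    start, score = most_hydrophobic_window(example)
    print(f"peak window starts at residue {start + 1}, mean hydropathy {score:.2f}")
```

A strongly positive peak within the first 20 to 30 residues, preceded by a few lysine or arginine residues, is the pattern described above for leader peptides.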

2. All putative surface protein sequences were also matched against the 
protein/DNA sequences in the publicly available databases [OWL-proteins inclusive 
of SwissProt and GenBank translations]. This allows us to identify sequences similar 
to known genes or homologues of genes for which some function has been ascribed. 
Hence it has been possible to predict a function for some of the genes identified 
using the LEEP system and to establish unequivocally that the system can be used to 
identify and isolate gene sequences of surface-associated proteins. We should also be 
able to confirm that these proteins are indeed surface-related and not artefacts. The 
LEEP system has been used to identify novel gene targets for vaccine and therapy. 

3. Some of the proteins identified did not possess a typical leader 
peptide sequence and did not show homology with any DNA/protein sequences in 
the database. Indeed, these proteins may indicate the primary advantage of our 
screening method, i.e. the isolation of atypical surface-related proteins, which may 
have been missed in all previously described screening protocols or approaches 
based on sequence homology searches. 

In all cases, only partial gene sequences were initially obtained. Full length genes 
were obtained in all cases by reference to the TIGR S. pneumoniae database 
(www@tigr.org). Thus, by matching the originally obtained partial sequences with 
the database, we were able to identify the full length gene sequences. In this way, as 
described herein, three groups of genes were clearly identified, i.e. a group of genes 
encoding previously unidentified S. pneumoniae proteins, a second group exhibiting 
some homology with known proteins from a variety of sources, and a third group 
which encoded known S. pneumoniae proteins which were, however, not known as 
antigens. 

Example 2: Vaccine trials 

pcDNA3.1+ as a DNA vaccine vector 

The vector chosen for use as a DNA vaccine vector was pcDNA3.1 (Invitrogen) 
(specifically pcDNA3.1+; the forward orientation was used in all cases, but the 
vector may be referred to as pcDNA3.1 from here on). This vector has been widely 
and successfully employed in the literature as a host vector to test vaccine candidate 
genes for protection against pathogens (Zhang et al., Kurar and Splitter, Anderson 
et al.). The vector was designed for high-level stable and non-replicative transient 
expression in mammalian cells. pcDNA3.1 contains the ColE1 origin of replication, 
which allows convenient high-copy number replication and growth in E. coli. This in 
turn allows rapid and efficient cloning and testing of many genes. The pcDNA3.1 
vector has a large number of cloning sites and also contains the gene encoding 
ampicillin resistance to aid in cloning selection and the human cytomegalovirus 
(CMV) immediate-early promoter/enhancer, which permits efficient, high-level 
expression of the recombinant protein. The CMV promoter is a strong viral promoter 
in a wide range of cell types including both muscle and immune (antigen presenting) 
cells. This is important for an optimal immune response as it remains unknown which 
cell types are most important in generating a protective response in vivo. A T7 
promoter upstream of the multiple cloning site affords efficient expression of the 
modified insert of interest and allows in vitro transcription of a cloned gene in 
the sense orientation. 

Zhang, D., Yang, X., Berry, J., Shen, C., McClarty, G. and Brunham, R.C. (1997) 
"DNA vaccination with the major outer-membrane protein genes induces acquired 
immunity to Chlamydia trachomatis (mouse pneumonitis) infection". Infection and 
Immunity, 176, 1035-40. 

Kurar, E. and Splitter, G.A. (1997) "Nucleic acid vaccination of Brucella abortus 
ribosomal L7/L12 gene elicits immune response". Vaccine, 15, 1851-57. 

Anderson, R., Gao, X.-M., Papakonstantinopoulou, A., Roberts, M. and Dougan, 
G. (1996) "Immune response in mice following immunisation with DNA encoding 
fragment C of tetanus toxin". Infection and Immunity, 64, 3168-3173. 


Preparation of DNA vaccines 

Oligonucleotide primers were designed for each individual gene of interest derived 
using the LEEP system. Each gene was examined thoroughly and, where possible, 
primers were designed such that they targeted that portion of the gene thought to 
encode only the mature portion of the protein. It was hoped that expressing 
those sequences that encode only the mature portion of a target protein would 
facilitate its correct folding when expressed in mammalian cells. For example, in the 
majority of cases primers were designed such that putative N-terminal signal peptide 
sequences would not be included in the final amplification product to be cloned into 
the pcDNA3.1 expression vector. The signal peptide directs the polypeptide 
precursor to the cell membrane via the protein export pathway, where it is normally 
cleaved off by signal peptidase I (or signal peptidase II if a lipoprotein). Hence the 
signal peptide does not make up any part of the mature protein, whether it be 
displayed on the surface of the bacterium or secreted. Where an N-terminal 
leader peptide sequence was not immediately obvious, primers were designed to 
target the whole of the gene sequence for cloning and, ultimately, expression in 
pcDNA3.1. 

However, other additional features of proteins may also affect the 
expression and presentation of a soluble protein. DNA sequences encoding such 
features in the genes encoding the proteins of interest were excluded during the 
design of oligonucleotides. These features included: 

1. LPXTG cell wall anchoring motifs. 

2. LXXC lipoprotein attachment sites. 

3. Hydrophobic C-terminal domain. 

4. Where no N-terminal signal peptide or LXXC was present the start codon was 
excluded. 

5. Where no hydrophobic C-terminal domain or LPXTG motif was present the stop 
codon was removed. 


Appropriate PCR primers were designed for each gene of interest, and any and all of 
the regions encoding the above features were removed from the gene when designing 
these primers. The primers were designed with the appropriate restriction enzyme 
site followed by a conserved Kozak nucleotide sequence and an ATG start codon 
upstream of the insert of the gene of interest. In most cases GCCACC was used as 
the Kozak sequence (except in occasional instances, for example ID59). The Kozak 
sequence facilitates the recognition of initiator sequences by eukaryotic ribosomes. 
For example, a forward primer using a BamHI site would begin 
GCGGGATCCGCCACCATG followed by a small section of the 5' end of the gene 
of interest. The reverse primer was designed to be compatible with the forward 
primer and with a NotI restriction site at the 5' end in most cases (this site is 
TTGCGGCCGC), except in occasional instances, for example ID59, where an 
XhoI site was used instead of NotI. 
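
The primer layout described above can be sketched in Python as follows. The BamHI site, the GCCACC Kozak sequence, the ATG start codon and the TTGCGGCCGC NotI-containing tail are taken from the text; the function names and the `clamp` parameter (the few extra 5' bases before the restriction site) are illustrative assumptions. The text's example begins with GCG whereas the listed primers begin with a single C, so the clamp is left configurable.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def forward_primer(gene_5prime: str, clamp: str = "GCG",
                   site: str = "GGATCC", kozak: str = "GCCACC") -> str:
    """Clamp + restriction site + Kozak + ATG + start of the truncated gene."""
    return clamp + site + kozak + "ATG" + gene_5prime


def reverse_primer(gene_3prime_sense: str, tail: str = "TTGCGGCCGC") -> str:
    """NotI-containing tail + reverse complement of the sense-strand 3' end."""
    return tail + reverse_complement(gene_3prime_sense)


if __name__ == "__main__":
    # Gene-specific part of the ID51 forward primer, with a one-base clamp.
    print(forward_primer("AGTCATGTCGCTGCAAATG", clamp="C"))
    # Sense-strand 3' end corresponding to the ID51 reverse primer.
    print(reverse_primer(reverse_complement("ATACCAAACGCTGACATCTACG")))
```

With a one-base clamp the first call reproduces the ID51 forward primer listed in the PCR primers section.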

PCR primers 

The following PCR primers were designed and used to amplify the truncated genes 
of interest. 

ID5 

Forward Primer 5' CGGATCCGCCACCATGGGTCTAATTGAAGACTTAAAAAATCAA 3' 
Reverse Primer 5' TTGCGGCCGCCAATGCTAGACTAAACACAAGACTCA 3' 

ID59 

Forward Primer 5' CGCGGATCCATGAAAAAAATCTATTCATTTTTAGCA 3' 
Reverse Primer 5' CCCTCGAGGGCTACTTCCGATACATTTTAAACTGTAGG 3' 

ID51 

Forward Primer 5' CGGATCCGCCACCATGAGTCATGTCGCTGCAAATG 3' 
Reverse Primer 5' TTGCGGCCGCATACCAAACGCTGACATCTACG 3' 

ID29 

Forward Primer 5' CGGATCCGCCACCATGCAAAAAGAGCGGTATGGTTATG 3' 
Reverse Primer 5' TTGCGGCCGCACCCCCATTCTTAATCCCTT 3' 

ID50 

Forward Primer 5' CGGATCCGCCACCATGGAGGTATGTGAAATGTCACGTAAA 3' 
Reverse Primer 5' TTGCGGCCGCTTTTACAAAGTCAAGCAAAGCC 3' 

Cloning 

The insert, along with the flanking features described above, was amplified using PCR 
against a template of genomic DNA isolated from type 4 S. pneumoniae strain 11886 
obtained from the National Collection of Type Cultures. The PCR product was cut 
with the appropriate restriction enzymes and cloned into the multiple cloning site of 
pcDNA3.1 using conventional molecular biological techniques. Suitably mapped 
clones of the genes of interest were cultured and the plasmids isolated on a large 
scale (> 1.5 mg) using Plasmid Mega Kits (Qiagen). Successful cloning and 
maintenance of genes was confirmed by restriction mapping and by sequencing 
approximately 700 base pairs through the 5' cloning junction of each large scale 
preparation of each construct. 


Strain validation 

A strain of type 4 was used in the cloning and challenge methods; this is the strain 
from which the S. pneumoniae genome was sequenced. A freeze-dried ampoule of a 
homogeneous laboratory strain of type 4 S. pneumoniae, strain NCTC 11886, was 
obtained from the National Collection of Type Cultures. The ampoule was opened and 
the culture resuspended in 0.5 ml of tryptic soy broth (0.5% glucose, 5% 
blood). The suspension was subcultured into 10 ml tryptic soy broth (0.5% glucose, 
5% blood) and incubated statically overnight at 37 °C. This culture was streaked on 
to 5% blood agar plates to check for contaminants and confirm viability, and on to 
blood agar slopes, and the rest of the culture was used to make 20% glycerol stocks. 
The slopes were sent to the Public Health Laboratory Service where the type 4 
serotype was confirmed. 

A glycerol stock of NCTC 11886 was streaked on a 5% blood agar plate and 
incubated overnight in a CO2 gas jar at 37°C. Fresh streaks were made and optochin 
sensitivity was confirmed. 

Pneumococcal challenge 

A standard inoculum of type 4 S. pneumoniae was prepared and frozen down by 
passaging a culture of pneumococcus 1x through mice, harvesting from the blood of 
infected animals, and growing up to a predetermined viable count of around 10^9 
cfu/ml in broth before freezing down. The preparation is set out below as per the 
flow chart. 

Streak pneumococcal culture and confirm identity 
| 
V 
Grow over-night culture from 4-5 colonies on plate above 
| 
V 
Animal passage pneumococcal culture 
(i.p. injection; cardiac bleed to harvest) 
| 
V 
Grow over-night culture from animal passaged pneumococcus 
| 
V 
Grow day culture (to pre-determined optical density) from over-night of animal 
passage and freeze down at -70 °C - this is the standard inoculum 
| 
V 
Thaw one aliquot of standard inoculum to viable count 
| 
V 
Use standard inoculum to determine effective dose (called Virulence Testing) 
| 
V 
All subsequent challenges - use standard inoculum to effective dose 

An aliquot of standard inoculum was diluted 500x in PBS and used to inoculate the 
mice. 

Mice were lightly anaesthetised using halothane and then a dose of 1.4 x 10^5 cfu of 
pneumococcus was applied to the nose of each mouse. The uptake was facilitated by 
the normal breathing of the mouse, which was left to recover on its back. 

S. pneumoniae vaccine trials 

Vaccine trials in mice were carried out by the administration of DNA to 6 week old 
CBA/ca mice (Harlan, UK). Mice to be vaccinated were divided into groups of six 
and each group was immunised with recombinant pcDNA3.1+ plasmid DNA 
containing a specific target-gene sequence of interest. A total of 100 µg of DNA in 
Dulbecco's PBS (Sigma) was injected intramuscularly into the tibialis anterior 
muscle of both legs (50 µl in each leg). A boost was carried out using the same 
procedure 4 weeks later. For comparison, control groups were included in all 
vaccine trials. These control groups were either unvaccinated animals or those 
administered with non-recombinant pcDNA3.1+ DNA (sham vaccinated) only, 
using the same time course described above. Three weeks after the second 
immunisation, all mouse groups were challenged intranasally with a lethal dose of 
S. pneumoniae serotype 4 (strain NCTC 11886). The number of bacteria administered 
was monitored by plating serial dilutions of the inoculum on 5% blood agar plates. A 
problem with intranasal challenge is that in some mice the inoculum bubbles out 
of the nostrils; this has been noted in the results table and taken account of in 
calculations. A less obvious problem is that a certain amount of the inoculum for 
each mouse may be swallowed. It is assumed that this amount will be the same for 
each mouse and will average out over the course of inoculations. However, the 
sample sizes that have been used are small and this problem may have significant 
effects in some experiments. All mice remaining after the challenge were killed 3 or 
4 days after infection. During the infection process, challenged mice were monitored 
for the development of symptoms associated with the onset of S. pneumoniae- 
induced disease. Typical symptoms, in approximate order, included piloerection, 
an increasingly hunched posture, discharge from eyes, increased lethargy and 
reluctance to move. The latter symptoms usually coincided with the development of 
a moribund state, at which stage the mice were culled to prevent further suffering. 
These mice were deemed to be very close to death, and the time of culling was used 
to determine a survival time for statistical analysis. Where mice were found dead, 
the survival time was taken as the last time point when the mouse was monitored 
alive. 

Interpretation of Results 

A positive result was taken as any DNA sequence that was cloned and used in 
challenge experiments as described above and which gave protection against that 
challenge. Protection was ascribed to those DNA sequences that gave statistically 
significant protection (to a 95% confidence level (p<0.05)), and also to those which 
were marginal or close to significant using the Mann-Whitney test or which showed 
some protective features, for example because there were one or more outlying mice 
or because the time to the first death was prolonged. It is acceptable to allow 
marginal or non-significant results to be considered as potential positives when it is 
considered that the clarity of some of the results may be clouded by the problems 
associated with the 
administration of intranasal infections. 

p value 2 refers to significance tests compared to pcDNA3.1+ vaccinated controls 

Statistical Analyses. 

Trial 1 - None of the other groups had significantly longer survival times than the 
controls. The survival times of the unvaccinated and pcDNA3.1 control groups were 
not significantly different. One of the mice from ID5 was an outlying result and the 
mean survival times for ID5 were extended but not significantly so. 
Trial 2 - The group vaccinated with ID59 had significantly longer survival times than 
the unvaccinated control group. 
Trial 5 - The group vaccinated with ID59 again survived for an average of almost 10 
hours longer than the controls but the results were not quite statistically significant. 
Trial 6 - The group vaccinated with ID51 did not have survival times significantly 
higher than unvaccinated controls (p= <36.0), however, there were 2 outlying mice 
in the vaccinated group. 


Vaccine trials 7 and 8 (See figure 2) 

Mean survival times (hours) 

Mouse       Unvacc        ID29 (7)    Unvacc        ID50 (8) 
number      control (7)               control (8) 
1           59.6          73.1        45.1          60.6 
2           47.2          54.8        50.8          60.6 
3           59.6          59.3        60.4          51.1 
4           70.9          54.8*       55.2          60.6 
5           68.6*         59.3        45.1          60.6 
6           76.0          54.8        45.1          60.6 
Mean        63.6          59.35       50.2          59.1 
sd          10.3          7.1         6.4           3.9 
p value 1                 <39.0                     0.0048 

* - bubbled when dosed so may not have received full inoculum. 
T - terminated at end of experiment having no symptoms of infection. 
Numbers in brackets - survival times disregarded assuming incomplete dosing 
p value 1 refers to significance tests compared to unvaccinated controls 
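
As a hedged illustration of how the summary rows of the trial 8 columns can be checked, the following Python sketch recomputes the mean and sample standard deviation from the individual survival times and calculates the Mann-Whitney U statistic for the ID50 group against its unvaccinated controls. The survival times are those in the table above; the function name is illustrative, and no p value is computed here because the handling of ties and the exact procedure used for the reported p values are not stated in this document.

```python
from statistics import mean, stdev

# Survival times (hours) for trial 8, taken from the table above.
unvacc_control_8 = [45.1, 50.8, 60.4, 55.2, 45.1, 45.1]
id50_8 = [60.6, 60.6, 51.1, 60.6, 60.6, 60.6]


def mann_whitney_u(treated: list[float], control: list[float]) -> float:
    """U statistic for the treated group.

    Counts the treated/control pairs in which the treated animal survived
    longer; a tie between the two groups contributes 0.5.
    """
    u = 0.0
    for t in treated:
        for c in control:
            if t > c:
                u += 1.0
            elif t == c:
                u += 0.5
    return u


if __name__ == "__main__":
    for label, group in (("Unvacc control (8)", unvacc_control_8),
                         ("ID50 (8)", id50_8)):
        print(f"{label}: mean {mean(group):.2f} h, sd {stdev(group):.2f} h")
    u = mann_whitney_u(id50_8, unvacc_control_8)
    print(f"U = {u:.1f} out of {len(id50_8) * len(unvacc_control_8)} pairs")
```

In 34 of the 36 possible pairings the ID50-vaccinated mouse outlived the control mouse, which is in line with the significant difference reported for trial 8.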

Statistical Analyses. 

Trial 7 - The ID29 vaccinated group showed prolonged times to the first death. 
Trial 8 - The group vaccinated with ID50 survived significantly longer than 
unvaccinated controls. 

Appendix 1 - Oligonucleotide primers 

nucS1 
     BglII EcoRV 
5'- cgagatctgatatctcacaaacagataacggcgtaaatag -3' 

nucS2 
     BglII SmaI 
5'- gaagatcttccccgggatcacaaacagataacggcgtaaatag -3' 

nucS3 
     BglII EcoRV 
5'- cgagatctgatatccatcacaaacagataacggcgtaaatag -3' 

nucR 
     BamHI 
5'- cgggatccttatggacctgaatcagcgttgtc -3' 

NucSeq 
5'- ggatgctttgtttcaggtgtatc -3' 

pTREPF 
5'- catgatatcggtacctcaagctcatatcattgtccggcaatggtgtgggctttttttgnttagcggataa 
caatttcacac -3' 

pTREPR 
5'- gcggatcccccgggcttaattaatgtttaaacactagtcgaagatctcgcgaattctcctgtgtgaaatt 
gttatccgcta -3' 

pUCF 
5'- cgccagggttttcccagtcacgac -3' 

VR 
5'- tcaggggggcggagcctatg -3' 

V1 
5'- tcgtatgttgtgtggaattgtg -3' 

V2 
5'- tccggctcgtatgttgtgtggaattg -3' 

TABLE 1 

ID4 1200 bp 


ATGAGAAATATG TGGGTT GTAATCAAGGAAACCTATCTTCGACATGTCGAGTCATGGAG I 11 CI 111 11 ATGGTGA 
TTTCGCCGTTCCTCTTTTTAGGAATCTCTGTAGGAATTGGGCATCTCCAAG 

AGTGGCAGTAGTGACAACAGTGCCATCTGTAGCAGAAGGACTGAAGAATGTAAATGGTGTTAACTTCGACTATAA 
AGACGAAGCAAGTGCCAAAGAAGCAATTAAAGAAGAAAAATTAAAAGGTTATTTGACCATTGATCAAGAAGATA 
GTGTTCTAAAGGCAGTTTATCATGGCGAAACATCGCTTGAAAATGGAATTAAATTTGAGGTTACAGGTACACTCA 
ATGAACTGCAAAATCAGCTTAATCGTTCAACTGCTTCCTTG 

TTCAATTCACAGAAAAGATTGATGAAGCCAAGGAAAATAAAAAGTTTATTCAAACAATTGCAGCAGGTGCCTTAG 

GATTCrTTCTTTAT ATGATTC TGATTACCTATGCGGGTGTAACAGCTCAGGAAGTTGCCAGTGAAAAAGGCACCAA 

AAT^TGGAAGTCGTTTTTTCTAGCATAAGGGCAAGTCACTATTTCTATGCGCGGATG 
ATTTTAACGCATATTGGGATCTATGTTGTAGGTGGTCTGGCTGCCGTTTTGCTCTTTAAAGATTT 

TCAGTCTGGTATTTTGGATCACTTGGGAGATGCTATCTCACTGAATACCTTGCTCTTTATTTTG 

TGTACGTAGTCTTGGCAGCCTTCCTAGGATCTATGGTTTCT^ 

GATGATTTTGATTATGGGTGGTTTTTTTGGAGTGACAGCT 

GGTTCITATATTCCCTTTATTTCGACCTT 
CATGGATTTCACTTGCTATTACAGTGATTTTTGCGGTGGTAGCAACAGGATTTATCGGACGCATGTATGCTAGTCT 

CGTTCTTC A AACG G ATG ATTT AGG G ATTTGG AAAACCTTT A AA CGTG CCTT ATCTT AT A A AT A G 

MRNMWVVIKETYLRHVESWSFFFMVISPFLFLGISVGIGHLQGSSMAKNNKVAVVTTVPSVAEGLKNVNGVNFDYKD 
EASAKEAIKEEKLKGYLTIDQEDSVLKAVYHGETSLENGIKFEVTGTLNELQNOLNRSTASLSOEQEKRLAQTIQFrEKI 
DEAKENKKFIQTIAAGALGFFLYMILITYAGVTAQEVASEKGTKIMEVVFSSIRASHYFYARMMALFLVILTHIGIYVVG 
GLAAVLLFKDLPFLAQSGILDHLGDAISLm'LLFILISLFMYVVLAAFLGSMVSRPEDSGKAl^PLMILIMGGFFGVTALG 
AAGDNLLLKIGSYIPFISTFFMPFRTINDYAGGAEAWISLAITVIFAVVATGFIGRMYASLVLQTDDLGIWKTFKRALSYK 
Z 

ID5 1125 bp 

CCTGGGAAAGTCTTGAAAATTATGATAGAATGGTGGAAGGAAAAATTCAGGAGAGTAGTAGTGACTCAAAATGTT 
GAAAGTCTTCTCGTATCCATTGTAATCAGTGCATACAATGAAGAAAAATATCTGCCTGGTCTAATTGAAGACTTAA 
AAAATCAAACCTATCCTAAAGAGGATATTGAAATTCTATTTATAAATGCTATGTCCACAGATGGGACCACAGCTA 

TCATTCAGCAATTTATAAAGGAAGATACAGAGTTTAACTCAATTAGATTGTATAACAATCCTAAGAAAAATCAAG 
CTAGT GGTTTT AACCTGGGAGTTAAACATrCTGTAGGGGACCTTATTTTAAAAATTGATGCTCAT^ 
TGAGACTTTTGTAATGAACAATGTGGCTATTATTCAACAAGGTGAATTTGTCTGTGGGGGGCCTAGACCGACGATT 
GTCGAAGGAAAAGGAAAATGGGCAGAGACCTTGCATCTTGTTGAGGAAj\ATATGTTTGGCAGTAGCATTGCCAAT 
TATCGAAATAGTTCTGAGGATAGATATGTTTCTTCTATTTTTCATGGAATGTATAAACGAGAGGTTTTCCAGAAGG 

TTGGTTTAGTAAATGAGCAACTTGGCCGAACTGAAGATAATGATATTCATTATAGAATTCGAGAATATGGTTATAA 
AATCCGCTATAGCCCAAGTATTCTATCTTATCAGTATATTCGACCAACATTCAAGAAAATGCTGCATCAAAAGTAT 
TCAAATGGTTTGTGGATTGGCTTGACAAGTCATGTTCAGTC 
ATTT^TTTTGACTCTTGTGTTTAGTCT 
ATTTTCTACTTTTGTCATTACT 

45 ATTTTATTTTCCATTCACTTTGCTTATGGCCTTGGGACGATTGTAGGTTTAATr 

AGTACAAGAGAACAATAATTTATTTGGATAAAATAAGCCAAATAAATCAAAATATGCTATAA 

PGKVLKIMIEWWKEKFRRVVVTQNVESLLVSIV1SAYNEEKYLPGLIEDLKNQTYPKEDIE1LFINAMSTDGTTA1IQQFIK 
EDTEFNSIRLYNNPKKNQASGFNLGVKHSVGDLILKIDAHSKVTETFVMNNVAIIQQGEFVCGGPRPTIVEGKGKWAET 
LHLVEENMFGSSIANYRNSSEDRYVSSIFHGMYKREVFQKVGLVNEQLGRTEDNDIHYRIREYGYKIRYSPSILSYQYIRP 
TFKKMLHQKYSNGLWIGLTSHVOFKCLSLFHYVPCLFVLSLVFSLALLPITFVFITLLLGAYFLLLSLLTLLTLLKHKNGF 
LIVMPFILFSIHFAYGLGTIVGLIRGFKWKKEYKRTIIYLDKISQINQNML2 



ID11 696 bp 

A TGAT GAAAGAACAAAATACGATAGAAATCGATGTATTTCAATTAGTTAAAAGCTTGTGGAAACGCAAGCTAATG 
ATTTTAATAGTGGCACTTGTGACAGGTGCGGGGGCTTTTGCATATAGCACTTTTATTGTTAAGCCAGAATAT 
GTACCACGCGAATTTACGTAGTGAATCGCAATCAj\GGAGACAAGCCGGGGTTGACAAATCAGGATTTGCAGGCAG 
GAACTTATCTGGTAAAAGACTACCGTGAGATTATCCTTTCGCAGGATGTTTTGGAGGAA 

ACTAGATTTGACGCCAAAAGGTTTGGCTAATAAAATTAAAGTGACAGTACCAGTTGATACCCGTATTGTCTCTATT 
TCAGTTAATGATCGAGTTCCTGAAGAGGCAAGCCGTATCGCTAACTCTTTGAGAGAAGTAGCTGCTCAAAAAATT 
ATC AGT ATT ACT CGTGTTTCTG ACGTGAC A A C ACTG G AGG AGG C A AG G CCGG CG AT ATCC C CGTCTTCG CC AA AT 
ATTAAACGCAATACACTAATTGGTTTTTTGGCAGGGGTGATTGGAACTAGTGTTATAGTTCTTCATCTT
GGATACTCGTGTGAAACGTCCGGAAGATATCGAAAATACATTGCAGATGACACTTTTGGGAGTTGTGCCAAACTT
GGGTAAGTTGAAATAG 

MMKEQNTIEIDVFQLVKSLWKRKLMILIVALVTGAGAFAYSTFIVKPEYTSTTRIYVVNRNQGDKPGLTNQDLQAGTYL
VKDYREIILSQDVl^EWSDUCLDLTPKGLANKIKVTVPVDTRIVSISVNDRVPEEASRIANSLREVAAQKIISITRVSDVT
TLEEARPAISPSSPNIKRNTTLIGFLAGVIGTSVIVLHLELLDTRVKRPEDIENTLQMTLLGVVPNLGKLKZ

ID19 555 bp

ATGGTAAAAGTAGCAGTTATATTAGCTCAGGGCTTTGAAGAAATTGAAGCCTTGACAGTTGTAGATGTCTTGCGTC
GAGCCAATATCACATGTGATATGGTTGGTTTTGAAGAGCAAGTAACGGGTTCGCATGCAATCCAAGTAAGAGCAG 
ATCATGTCTTTGATGGAGATTTATCAGACTATGATATGATTGTTCTTCCTGGAGGTATGCCTGGTTCTGCACA
CGTGATAATCAGACCTTGATTCAAGAATTGCAAAGCTTCGAGCAAGAAGGGAAGAAACTAGCAGCCATTTGTGCG
GCACCAATTGCCCTCAATCAAGCAGAGATATTGAAAAATAAGCGATACACTTGTTATGACGGCGTTCAAGAGCAA 

ATCCTTGATGGTCACTACGTCAAGGAAACAGTAGTGGTAGATGGTCAGTTGACAACCAGTCGGGGTCCTTCAACA
GCCCTTGCCTTTGCCTACGAGTTGGTGGAGCAACTAGGAGGGGACGCAGAGAGTTTACGAACAGGAATGCTCTAT 
CGAGATGTCTTTGGTAAAAATCAGTAA 

MVKVAVILAQGFEEIEALTVVDVLRRANITCDMVGFEEQVTGSHAIQVRADHVFDGDLSDYDMIVLPGGMPGSAHLR 
DNQTLIQELQSFEQEGKKLAAICAAPIALNQAEILKNKRYTCYDGVQEQILDGHYVKETVVVDGQLTTSRGPSTALAFA
YELVEQLGGDAESLRTGMLYRDVFGKNQZ 

ID27 306 bp 

GTGGTAGGGATGGTAGAACCAAACCTAGAAAGCCTTATAAAAGATCTTTACAATCATGCTCGACATGATTTGAGT
GAAGATTTAGTTGCTGCTCTCCTAGAGACTACTAAAAAACTGCCTACTACAAATGAGCAATTGCAGGCAG 
TCTCAGGCCTGGTCAATCGTGAATTGCTCCTAAATCCCAAACATCCAGCACCTGAGTTGCTCAACTTGGCT 
TGTCAAAAGAGAAGAAGCCAAGTACAGAGGAACTGCGACTTCTGCGCTTATGTATGAGGAACTCTTTAAAATGCT 
TTGA 






MVGMVEPNLESLIKDLYNHARHDLSEDLVAALLETTKKLPTTNEQLQAVRLSGLVNRELLLNPKHPAPELLNLARFVK
REEAKYRGTATSALMYEELFKMLZ

ID29 945 bp



TTGTTCTTAAAAAAGGAAAGAGAGGTAATCAGCATGCGTAAATGGACAAAAGGATTTCTCATCTTTGGTGTGGTG
ACTACCGTTATCGGCTTTATCCTGCTTITTGTAGGTATCCAATCrGACG 

AGAACCTGTCTATGATAGCCGTACGGAAAAGCTAACCTTTGGCAAGGAAGTCGAAAACCTAGAAATTACTCTCCA 
CCAACACACGCTCACCATCACAGACTCTTTCGATGATCAAATCCACATTTCTTACCATCCATCTCITTCTGCT 

CATGATCTTATCACCAATCAGAACGATAGAACTCTGAGTCTCACTGATAAGAAACTGTCTGAAACTCCGTTTCT
CITCTGGAATTGGTGGGATTCTTCATATCGCAAGTAGCTACTCTAGTCGTTTTGAAGAAGTTATTCTCCG 
AAAAGGGAGAACTCTAAAAGGGATCAACATCTCAGCCAATCGCGGACAAACCACCATCATAAATGCTAGCCTTGA 
AAATGCGACCCTCAATACAAACAGCTATATCCTCCGAATTGAAGGAAGTCGTATCAAAAACAGTAAACTCACAAC 
GCCCAATATCGTTAATATCTTTGATACAGTTCTTACAGATAGTCAGCTAGAGTCAACAGAGAATCACTTCCACGCT 

GAAAATATCCAAGTCCATGGCAAGGTTGAACTGACTGCCAAAGATTATCTCAGAATCATCCTAGACCAGAAAGAA
AGCCAACGAATTAACTGGGACATCTCAAGCAACTATGGTTCTATCTTCCAATTCACAAGAGAAAAGCCTGAATCA
AGAGGTACGGAATTAAGCAACCCTTACAAAACTGAAAAAACCGATGTCAAGGATCAACTCATTGCGAGATCTGAT
GATAATATTGATCTAATATCCACACCAAGCAGACGTTGA 

MFLKKEREVISMRKWTKGFLIFGVVTTVIGFILLFVGIQSDGIKSLLSMSKEPVYDSRTEKLTFGKEVENLEITLHQHTLTI
TDSFDDQIHISYHPSLSAHHDLITNQNDRTLSLTDKKLSETPFLSSGIGGILHIASSYSSRFEEVILRLPKGRTLKGINISANR
GQTTIINASLENATLNTNSYILRIEGSRIKNSKLTTPNIVNIFDTVLTDSQLESTENHFHAENIQVHGKVELTAKDYLRIILD
QKESQRINWDISSNYGSIFQFTREKPESRGTELSNPYKTEKTDVKDQLIARSDDNIDLISTPSRRZ

ID30 879 bp

ATGAAACAAGAATGGTTTGAAAGTAATGATTTTGTAAAAACAACAAGCAAGAACAAGCCTGAAGAGCAAGCTCA 
AGAGGTTGCAGACAAGGCTGAAGAAACGATAGCCGATCTCGATACACCAATTGAAAAAAATACTCAGTTAGAGG 
AGGAAGTCCCTCAAGCTGAAGTCGAATTGGAAAGCCAGCAAGAAGAGAAAATTGAAGCTCCTGAAGACAGTGAA 

GCGAGAACAGAAATAGAAGAAAAGAAGGCATCTAATTCTACTGAAGAAGAGCCAGACCTTTCTAAAGAAACAGA
AAAAGTCA CTAT AGCTGAAGAGAGCCAAGAAGCTCTTCCTCAGCAAAAAGCAACCACGAAAGAGCCACITCTTAT 
CAG TAAAT CITTAGAAAGTCCTTATATCCCCGACCAAGCTCCAAAATCTAGGGATAAATGGAAAGAGCAAGTGCT 
TGAi I I I lGG TCTTGGCTAGTGG AAGCG ATCAAATCT CCTAC AAGTAAGTTGGAAACAAGTATCACACACAGTTAC 
AC AGCCTTTCTCTTGCTCATTCTGTTTTCTGC ATCTTCC 1 1 I I I L IT! AGTATCTATCACATCAAACATGCTTACTAT 

GGACATATAGCAAGCATTAACAGTCGCTTCCCTGAGCAGCTAGCTCCTTTAACT L ' ri ' I ' ri ' i CTATCATCTCTATCCT
AGTAGCGACAACACTCTTCTTCTTTTCATTCCTCTTC

GACTGGACGCTAGACAAGGTTCTCCAACAATATAGTCAACTCTTGGCAATTCCAATCTCCTCACTGCTATTGCTAG 
TTTCTTTGCrrrCTTTGATAGCCTACGATTTACAGCCCTCTTGTGTGTGA 

MKQEWFESNDFVKTTSKNKPEEQAQEVADKAEETIADLDTPIEKNTQLEEEVPQAEVELESQQEEKIEAPEDSEARTEIE
EKKASNSTEEEPDLSKETEKVTIAEESQEALPQQKATTKEPLLISKSLESPYIPDQAPKSRDKWKEQVLDFWSWLVEAIKS
PTSKLETSITHSYTAFLLLILFSASSFFFSIYHIKHAYYGHIASINSRFPEQI^ 
QEKDWTLDKVLQQYSQLLAIPISSLLLLVSLLSLIAYDLQPSCVZ

ID105 990 bp

ATGCAACTCGCTTCTTCGGTCTACTCATTGTTCGTCTGGTA 
GCATGCGTAAATGGACAAAAGGATTTCTCATCTTTGGTGTC 

GGTAT CCAA TCTGACGGGATTAAGAGCCTACTTTCCATGTCCAAAGAACCTGTCTATGATAGCCGTACGGAAAAG 
CTAACCTTTGGCAAGGAAGTCGAAAACCTAGAAATTACTCTCCACCAACACACGCTCACCATCACAGACTCTTTC
GATGATCAAATCCACATTTCTTACCATCCATCTCTTTCTGCTCACCATGATCTTATCACCAATCAGAACGATAGAA 
CTCTGAGTCTCACTGATA AGAA ACTGTCTGAAACTCCGTTTCTCT 

AAGTAGCTACTCTAGTCGTTTTGAAGAAGTTATTCTCCGACTACCAAAAGGGAGAACTCTAAAAGGGATCAACAT 
CTCAGCCAATCGCGGACAAACCACCATCATAAATGCTAGCCTTGAAAATGCGACCCTCAATACAAACAGCTATAT 

CCTCCGAATTGAAGGAAGTCGTATCAAAAACAGTAAACTCACAACGCCCAATATCGTTAATATCTTTGATACAGTT
CTTACAGATAGTCAGCTAGAGTCAACAGAGAATCACTTCCACGCTGAAAATATCCAAGTCCATGGCAAGGTTGAA 
CTGACTGCCAAAGATTATCTCAGAATCATCCTAGACCAGAAAGAAAGCCAACGAATTAACTGGGACATCTCAAGC
AACTATGGTTCTATCTTCCAATTCACAAGAGAAAAGCCTGAATCAAGAGGTACGGAATTAAGCAACCCTTACAAA 
ACTGAAAAAACCGATGTCAAGGATCAACTCATTGCGAGATCTGATGATAATATTGATCTAATATCCACACCAAGC 

AGACGTTGA

MQLASSVYSLFVWYNLFLKKEREVISMRKWTKGFLIFGVVTTVIGFILLFVGIQSDGIKSLLS 

KEVENLEITLHQHTLTITDSFDDQIHISYHPSLSAHHDLITNQNDRTLSLTDKKLSETPFLSSGIGGILHIASSYSSRFEEVIL
RLPKGRTLKGINISANRGQTTIINASLENATLNTNSYILRIEGSRIKNSKLTTPNIVNIFDTVLTDSQLESTENHFHAENIQV
HGKVELTAKDYLRIILDQKESQRINWDISSNYGSIFQFTREKPESRGTELSNPYKTEKTDVKDQLIARSDDNIDLISTPSRR
Z 

ID107 78 bp


ATGATATGTAAAATGAAGCAGGGAGGGAGCAGGGCGTGCTGGGGATGGAGAGTGGGGGAGGGACGCTGCTATTT 
TAATC 

MICKMKQGGSRACWGWRVGEGRCYFN 
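
The correspondence between a nucleotide listing and the amino acid sequence printed beneath it can be checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code and translation in the first reading frame; the names `CODON_TABLE`, `ID107_DNA` and `translate` are hypothetical helpers introduced for illustration only. Stop codons are reported as `*` here, whereas most entries in this listing use `Z`.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order at each position. "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ID107 nucleotide sequence as transcribed in this listing (assumption:
# the two printed lines are simply concatenated).
ID107_DNA = (
    "ATGATATGTAAAATGAAGCAGGGAGGGAGCAGGGCGTGCTGGGGATGGAGAGTGGGGGAGGGACGCTGCTATTT"
    "TAATC"
)


def translate(dna: str) -> str:
    """Translate complete codons in frame 1; unrecognised codons become 'X'."""
    seq = "".join(dna.split()).upper()      # drop whitespace, normalise case
    usable = len(seq) - len(seq) % 3        # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


if __name__ == "__main__":
    print(translate(ID107_DNA))
```

Applied to the ID107 sequence as transcribed, the sketch yields the same 26 residues as the amino acid sequence given for ID107. A codon containing a character outside A, C, G and T is rendered as `X`, which makes transcription damage in a listing easy to locate.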

ID109 714 bp

CGATAA^GAGGCCTTGAGTAATCTCAATTTGCAGATTGAAAATGGAGAGATTATGGGCTTGATTGGTCATAATGG 
GGCTGGAAAATCGACCACTATAAAATCCCTAGTCAGTATCATTTCACCCAGCAGTGGTCGTATTTTGGTAGACGGT
CAGGAGTTATCGGAAAATCGCTTGGCTATTAAACGAAAGATTGGCT 

GCTTAACGGCCAATGAATTTTGGGAATTGATCGCCTCATCCTATGATCTGAGTAGATCTGACTTGGAGGCTAGTCT
AGCTAGGCTATTGAACGTTTTTGATTTTGCTGAAAATCGCTATCAGGTTATTGAA 

CAGAAAGTCTTTGTCATCGGAGCACTCTTGTCTGATCCCGATATTTGGGTTTTGGACGAACCCTTGACT 
ATCCCCAGGCTGCCTTTGATTTGAAACAGATGATGAAGGAACATGCACAAAAAGGGAAGACAGTCTTGTTTTCAA
CTCATGTCCTAGAGGTGGCAGAGCAAGTCTGTGATCGGATTGCCATTTTGAAAAAGGGGCATTTGATTTATTGTGG 
TAAGGTAGAGGACTTGAGGAAAGACCACCCAGACCAGTCTTTGGAAAGTATCTACCTTAGTCTTGCTGGTAGAAA
AGAGGAGGTTGCGGATGCGTCTCAAGGTCATTAA 

DKEALSNLNLQIENGEIMGLIGHNGAGKSTTIKSLVSIISPSSGRILVDGQELSENRLAIKRKIGYVADSPDLFLRLTANEF
WELIASSYDLSRSDLEASLARLLNVFDFAENRYQVIETLSHGMRQKVFVIGALLSDPDIWVLDEPLTGLDPQAAFDLKQ 
MMKEHAQKGKTVLFSTHVLEVAEQVCDRIAILKKGHLIYCGKVEDLRKDHPDQSLESIYLSLAGRKEEVADASQGHZ

ID112 360 bp 


ATGGCTTTGTTTTCAGAGAGAGGAGCAGTACGGAAGACACCAATGGCAAGTCCAATAATGAGACCTATGATGGTT 
CCGACGATAGAGATTAAAAGAGTGATACCAGCACCACGCAAGAGTTGTTGCCAGTTTTCAGAAAGAATTTTAGCA 
ACTTGGCTAAAGAAACTACTGCTAGTCTCTTCAGTTGTTGTAGCTTCGGCAGGTTGTTCCTTGAT
TCAAGGCAACTTGGTCATCriU'IGAAATGGTTTCAATGCTGGCATTGATTTGGCTAATACGATTGTCATTTITACGA 
AGCCCGATAGCGATAGCTGTATCTTCTTCCCCAGTTTTGAAACCAGGTTCTACTTGA

MALFSERGAVRKTPMASPIMRPMMVPTIEIKRVIPAPRKSCCQFSERILAW
FEMVSMLALIWLIRLSFLRSPIAIAVSSSPVLKPGSTZ 

ID 128 - 3.43 

ATGAAATTTAGTAAAAAATATATAGCAGCTGGATCAGCTGTTATCGTATC 

CTTGAGTCTATGTGCCTATGCACTAAACCAGCATCGTTCGCAGGAAAATA 

AGGACAATAATCGTGTCTCTTATGTGGATGGCAGCCAGTCAAGTCAGAAA 

AGTGAAAACTTGACACCAGACCAGGTTAGCCAGAAAGAAGGAATTCAGGC 

TGAGCAAATTGTAATCAAAATTACAGATCAGGGCTATGTAACGTCACACG 

GTGACCACTATCATTACTATAATGGGAAAGTTCCTTATGATGCCCTCTTT 

AGTGAAGAACTCTTGATGAAGGATCCAAACTATCAACTTAAAGACGCTGA 

T ATTGT C AATG AAGTC A AGGGTG GTT AT ATC AT C A AGGTCG ATGG AA AAT 

ATTATGTCTACCTGAAAGATGCAGCTCATGCTGATAATGTTCGAACTAAA 

GATGAAATCAATCGTCAAAAACAAGAACATGTCAAAGATAATGAGAAGGT 

TAACTCTAATGTTGCTGTAGCAAGGTCTCAGGGACGATATACGACAAATG 

ATGGTTATGTCTTTAATCCAGCTGATATTATCGAAGATACGGGTAATGCT 

T AT ATCGTTCCTC ATG G AGGTC ACT ATC ACT AC ATTC CC AA AAG CG ATTT 

ATCTGCTAGTGAATTAGCAGCAGCTAAAGCACATCTGGCTGGAAAAAATA 

TGCAACCGAGTCAGTTAAGCTATTCTTCAACAGCTAGTGACAATAACACG 

C AATCTGT AG C A A A AG G ATC A ACT AG C A AG C C AG C AAAT A A ATCTG A AAA 

TCTCC AG AGTC 11 1 1 1 1 G A AGG AACTCTATG ATTC ACCT AGCGCCC A ACGTT 

ACAGTGAATCAGATGGCCTGGTCTTTGACCCTGCTAAGATTATCAGTCGT 

ACACCAAATGGAGTTGCGATTCCGCATGGCGACCATTACCACTTTATTCC 

TT AC AG C AAG CTTTCTG CCTT AG AAG AAA AG ATTG CC AG AATG GTG CCT A 

TCAGTGGAACTGGTTCTACAGTTTCTACAAATGCAAAACCTAATGAAGTA 

GTGTCTAGTCTAGGCAGTCTTTCAAGCAATCC I 1L1 1 L 111 AACGACAAG 

TAAGGAGCTCTCTTCAGCATCTGATGGTTATATTTTTAATCCAAAAGATA 

TCGTTG AAG AAACGGCT AC AG CTT AT ATTGT A AG AC ATG GTG ATC ATTTC 

C ATT AC ATTCC A AA ATC AAATC AA ATTGG G C A A CCG A CTCTTCC AAA C AA 

TAGTCTAGCAACACCTTCTCCATCTCTTCCAATCAATCCAGGAACTTCAC 

ATGAGAAACATGAAGAAGATGGATACGGATTTGATGCTAATCGTATTATC 

GCTGAAGATGAATCAGGTTTTGTCATGAGTCACGGAGACCACAATCATTA 

TTTCTTCAAGAAGGACTTGACAGAAGAGCAAATTAAGGTGCGCAAAAACA 

TTTAG 

MKFSKKY1AAGSAVIVSLSLCAYALNQHRSQENKDNNRVSYVDGSQSSQK 

SENLTPDQVSQKEGIQAEQIVIKITDQGYVTSHGDHYHYYNGKVPYDALF 

SEELLMKDPNYQLKDADIVNEVKGGYIIKVDGKYYVYLKDAAHADNVRTK 

DE1NRQKQEHVKDNEKVNSNVAVARSQGRYTTNDGYVFNPADUEDTGNA 

YIVPHGGHYHYIPKSDLSASELAAAKAHLAGKNMQPSQLSYSSTASDNNT 

QSVAKGSTSKPANKSENLQSLLKELYDSPSAQRYSESDGLVFDPAKIISR 

TPNGVAIPHGDHYHFIPYSKLSALEEKIARMVPISGTGSTVSTNAKPNEV 

VSSLGSLSSNPSSLTTSKELSSASDGYIFNPKDIVEETATAYIVRHGDHF 

HYIPKSNQIGQPTLPNNSLATPSPSLPINPGTSHEKHEEDGYGFDANRII 

AEDESGFVMSHGDHNHYFFKKDLTEEQIKVRKNI*

TABLE 2
ID2 840 bp 

ATGGGAATTGCTCTAGAAAATGTGAATTTTACATATCAAGAAGGTACTCCCTTAGCTTCAGCAGCTTTGTCGGATG

TTTCTTTGACGATTGAAGATGGCTCTTATACAGCriTAATTGGGCACACAGGTAGTGGTAAATCA^ 
ACTCTTAAATGGTTTATTGGTGCCAAGTCAAGGGAGTGTGAGGGTTTTTGATACCTTAATCA 

AAT A AAG AT ATTCGTC A A ATT AG AA A ACAG GTTG G CTTGGT ATTTC AGTTTG CTG AAA ATC A G ATTTTTG AAG AA A 
CGGTTITGAAGGACGTTGCTTTTGGACCGCAAAATTrrGGAGTTTCTGAAG 

GAAACTGGCTCTGGTTGGAATTGATGAATCACTTTTTGATCGTAGTCCGTTTGAGCTGTCAGGGGGACAAATGAGA
CGTGTTGCCATTGCAGGCATACTTGCCATGGAGCCAGCTATATTAGTCTTAGATGAGCCAACAGCTGGTCTAGATC 
CTCTAGGGAGAAAAGAGTTGATGACCCTGTTCAAAAAACTCCACCAGTCAGGGATGACCATCGTCTTGGTAACGC 
ATTTGATGGATGATGTTGCTGAATATGCGAATCAAGTCTATGTAATGGAAAAGGGACGTTTAGTAAAGGGGGGCA
AACCAAGTGATGTCTTTCAAGACGTTGTTTTTATGGAAGAAGTTCAGTTGGGAGTACCTAAAATTACGGCCTTTTG

TAAACGATTGGCTGATAGAGGCGTGTCATTTAAACGATTACCGATTAAGATAGAGGAGTTCAAGGAGTCGCTAAA
TGGATAG 

MGIALENVNFTYQEGTPLASAALSDVSLTIEDGSYTALIGHTGSGKSm 

QIRKQVGLVFQFAENQIFEETVLKDVAFGPQNFGVSEEDAVKTAREKLALVGIDESLFDRSPFELSGGQMRRVAIAGILA 
MEPAILVLDEPTAGLDPLGRKELMTLFKKLHQSGMTIVLVTHLMDDVAEYANQVYVMEKGRLVKGGKPSDVFQDVV
FMEEVQLGVPKITAFCKRLADRGVSFKRLPIKIEEFKESLNGZ 

ID3 6360 bp

TACCCGGTAGTCTTAGCAGACACATCTAGCTCTGAAGATGCTTTAAACATCTCTGATAAAGAAAAAGTAGCAGAA
AATAAAGAGAAACATGAAAATATCCATAGTGCTATGGAAACrrCACAGGATTTTAAAGAGAAGAAAACAGCAGTC 
ATT A AGG AAA AAG A AG TTGTY AGT A AAAATCCTGTG AT AG A C AAT A AC ACT AG C AATG A AG AAG C AAAAATCAA 
AG AAG AAAATTCCAAT AA ATCCC A AG G AG ATT AT ACGG ACTC ATTTGTG AAT AAAAAC AC AG AAAATC CC AA AAA 
AG AAG AT AAAGTTGTCT ATATTG CTG A ATTT A A AG AT AA AG AATCTGG AG A A A A AGC AATC AAG G A ACT ATCCAG 

TCTTAAGAATACAAAAGTTTTATATACTTATGATAGAATTTTTAACGGTAGTGCCATAGAAACAACTCCAGATAAC

TTG G AC AAA ATT AAA C A A AT AG AAGGT ATTTC ATCG GTTG AAAG GG CAC AA A A AGTCC AACC C ATG ATG AATC AT 
GCCAGAAAGGAAATTGGAGTTGAGGAAGCTATTGATTACCTAAAGTCTATCAATGCTCCGTTrGGGAAAAAT^ 
GATGGTAGAGGTATGGTCATTTCAAATATCGATACTGGAACAGATTATAGACATAAGGCTATGAGAATCGATGAT 
G ATG CC A AAG CCTC A ATG A G ATTT A A A A A AG A AG ACTT A A A AGG CACTG AT A AA AATT ATTG GTTG AGTG AT AAA 

ATCCCTCATGCGTTCAATTATTATAATGGTGGCAAAATCACTGTAGAAAAATATGATGATGGAAGGGATTATTTTG
ACCCACATGGGATGCATATTGCAGGGATTCITGCTGGAAATGATACTGAACAAGACATCAAAAACTTTAACGGCA 
TAGATGGAATTGCACCTAATGCACAAATTITCTCTTACAAAATGTATTCTGACGCAGGATCTGGGTTTGCGGGTGA 
TGAAACAATGTTTCATGCTATTGAAGATTCTATCAAACACAACGTTGATGTTGTrrCGGTATCATCTGGTTTTACA 
GGAACAGGTCTTGTAGGTGAGAAATATTGGCAAGCTATTCGGGCATTAAGAAAAGCAGGCATTCCAATGGTTGTC 

40 GCTACGGGTAACTATGCGACITCTGCnTCAAGTTCTTCATGGGATTTAGTAGCAAATAATCATCTGAAAATGACCG 
ACACrGGAAATGTAACACGAACTGCAGCACATGAAGATGCGATAGCGGTCGCTTCTGCTAAAAATCAAACAGTTG 
AGTTTGATAAAGrrAACATAGGTGGAGAAAGTTTTAAATACAGAAATATAGGGGCCTTTTTCGATAAGAGTAAAA 
TCACAACAAATGAAGATGGAACAAAAGCTCCTAGTAAATTAAAATTTGTATATATAGGCAAGGGGCAAGACCAAG 
ATTTGATAGGTTTGGATCTTAGGGGCAAAATTGCAGTAATGGATAGAATTTATACAAAGGATTTAAAAAATGCTTT 

TAAAAAAGCTATGGATAAGGGTGCACGCGCCATTATGGTTGTAAATACTGTAAATTACTACAATAGAGATAATTG
GACAGAGCTTCCAGCTATGGGATATGAAGCGGATGAAGGTACTAAAAGTCAAGTGTTTTCAATTTCAGGAGATGA 
TGGTGTAAAGCTATGG A ACATGATTAATCCTGATAAAAAAA CTG AAGTCAAAAGAAAT AAT AAAG AAG ATTTT AA 
AGATAAATTGGAGCAATACTATCCAATTGATATGGAAAGTTTTAATTCCAACAAACCGAATGTAGGTGACGAAAA 

AGAGATTGACTTTAAGTTTGCACCTGACACAGACAAAGAACTCTATAAAGAAGATATCATCGTTCCAGCAGGATC
TACATCTTGGGGGCCAAGAATAGATTTACTTTTAAAACCCGATGTTTCAGCACCTGGTAAAAATATTAAATCCACG
CTTAATGTTATTAATGGCAAATCAACTTATGGCTATATGTCAGGAACTAGTATGGCGACTCCAATCGTGGCAGCTT

A CTGTTTTG ATT AG ACCG A A ATT A AAGGAAATG CTTG A A AG A CCTGT ATTG AAA A ATCTT AAG GG A G ATG AC A 
AAATAGATCTTACAAG TCrrA CAAAAATTGCCCTACAAAATACTGCGCGACCTATGATGGATGCAACTTCITGGA 
AAGAAAAAAGTCAAT ACTT TGCATCACCTAGACAACAGGGAGCAGGCCTAATTAATGTGGCCAATGCTTTGAGAA 
33 ATGAAGTTCTAGCAACTTTCA AAAAC ACTGATTCTAAAGGTTTGGTAAA 
AATAAAAGGTGATAAAAAATACirTACAATCAAGCTTCACAATACATCA 

GCATCAGCGATAACTACAGATTCrCTAACTGACAGATTAAAACTTGATGAAACATATAAAGATGAAAAATCTCCA 
GATGGTAAGCAAATTGTTCCAGA AATT CACCCAGAAAAAGTCAAAGGAGCAAATATCACATTTGAGCATGATACT 
TTCACTATAGGCGCAAATTCTAGCTTTGATTTGAATGCGGTTATAAATGTTGGAGAGGCCAAAAACAAAAATAAA 

TTTGTAGAATCATTTATTCATTTTGAGTCAGTGGAAGCGATGGAAGCTCTAAACTCCAGCGGGAAGAAAATAAAC
TTCCAACCTTCTTTGTCGATGCCTCTAATGGGATTTGCTGGGAATTGGAACCACGAACCAATCCTTGATAAATGGG 
CTTGGGAAGAAGGGTCAAGATCAAAAACACTGGGAGGTTATGATGATGATGGTAAACCGAAAATTCCAGGAACCT 
TAAATAAGGGAATTGGTGGAGAACATGGTATAGATAAATTTAATCCAGCAGGAGTTATACAAAATAGAAAAGATA 
AAAATACAACATCCCTGGATCAAAATCCAGAATTATTTGCTTTCAATAACGAAGGGATCAACGCTCCATCATCAA 

GTGGTTCTAAGATTGCTAACATTTATCCTTTAGATTCAAATGGAAATCCTCAAGATGCTCAACTTGAAAGAGGATT
AACACCTTCTCCACTTGTATTAAGAAGTGCAGAAGAAGGATTGATTTCAATAGTAAATACAAATAAAGAGGGAG
AAATCAAAGAGACITAAAAGTCATTTCGAGAGAACACTrTATTAGAGGAATrrTAAATTCT 

AAAGGGAATCAAATCATCTAAACTAAAAGTTTGGGGTGACnTGAAGTGGGATGGACTCATCrATAATCCTAGAGG 
TAGAGAAGAAAATGCACCAGAAAGTAAGGATAATCAAGATCCTGCTACTAAGATAAGAGGTCAATTTGAACCGAT 
TGCGGAAGGTCAATATTTCTATAAATTTAAATATAGATTAACTAAAGATTACCCATGGCAGGTTTCCTATATTCCT
GTAAAAATTGATAACACCGCCCCTAAGATTGTTTCGGTTGA 

AGGATACTTATCATAAGGTAAAAGATCAGTATAAGAATGAAACGCTATTTGCGAGAGATCAAAAAGAACATCCTG 
AAAAATTTGACGAGATTGCGAACGAAGTTTGGTATGCTGGCGCCGCTCTTGTTAATGAAGATGGAGAGGTTGAAA 
AAAATCTTGAAGTAACTTACGCAGGTGAGGGTCAAGGAAGAAATAGAAAACTTGATAAAGACGGAAATACCATTT 

ATGAAATTAAAGGTGCGGGAGATTTAAGGGGAAAAATCATTGAAGTCATTGCATTAGATGGTTCTAGCAATTTCA
CAAAGATTCATAGAATTAAATTTGCTAATCAGGCTGATGAAAAGGGGATGATTTCCTrATTATCrAGTAGATCCTGA 
TC AAG ATTC ATCT A A AT ATC A A AAGCTTG G CG AG ATTG C AG AATCT AA ATTT A AAAATTT AGG AAATGG A A AAG A 
G GG TAGTC TA A AAA AAG ATAC AA CTGG G GT AG AAC ATC ATC ATC A AG AA A ATG AAG AGTCT ATT A A AG A AAAAT 
CT AGTTTT ACT ATTG AT AG AA AT ATT TC A A C AATT AG AG ACTTTG AA A AT AA AG ACTT AAAG AA ACTC ATT A A AAA 

GAAATTTAGAGAAGTTGATGATTTTACAAGTGAAACTGGTAAGAGAATGGAGGAATACGATTATAAATACGATGA
TAAAGGAAATATAATAGCCTACGATGATGGGACTGATCTAGAATATGAAACTGAGAAACTTGACGAAATCAAATC 
AAAAATTTATGGTGTTCTAAGTCCGTCTAAAGATGGACACTITGAAATTCTTGGAAAGATAA 

AATGCCAAGGTATATTATGGGAATAACTATAAATCTATAGAAATCAAAGCGACCAAGTATGATTTCCACTCAAAA 
ACGATGACATTTGATCTATACGCTAATATTAATGATATTGTGGATGGATTAGCTTTTGCAGGAGATATGAGATTAT 
20 TTGTTA AAG AT AATG ATC AG AAAA AAG CTG A AATT A A AATT A G AATG CCTG AAA AAATTAAGG AAACT AAATC A G 

AATATCCCTATGTATCAAGTTATGGGAATGTCATAGAATTAGGGGAAGGAGATCIT^ 

ATTTAACTAAAATGGAATCTGGTAAAATCTATTCTGATTCAGAAAAACAACAATATCTGTTAAAGGATAATATCAT 
TCTAAGAAAAGGCTATGCACTAAAAGTGACTACCTATAATCCTGGAAAAACGGATATGTTAGAAGGAAATGGAGT 
CTATAGCAAGGAAGATATAGCAAAAATACAAAAGGCCAATCCTAATCTAAGAGCCCTTTCAGAAACAACAATTTA 

25 TGCTGATAGTAGAAATGTTGAAGATGGAAGAAGTACCCAATCTGTATTAATGTCGGCTTTGGACGGCTTT^ 

ATAAGGTATCAAGTGTTTACATTTAAj^TGAACGATAAAGGGGAAGCTATCGATAAAGACGGAAATCTTGTGACA 
G ATTCTTCT AAA CTTGT ATT ATTTGGT A AG G ATG AT AAAG A AT A C ACTG G AG AGG AT AAGTTC A ATGT AG AAG CT A 
TAA AAG AAG ATGG CTCCATGTT ATTT ATTG AT A CC A AACC AGT A A AC CTTTC AATGG AT AAG AACT A CTTT AATCC 
ATCT AAATCTAAT AAAATTT ATGT A CGAAATCC A GAATTTT ATTT AAG AGGT AAG ATTTCTG AT AAG G GTGGTTTT 

30 AA CT GGG A ATTG AG AGTT AATG AATCG GTTGT AG AT A ATT ATTT AATCT ACG G AG ATTT AC ACATTG AT AAC ACT A 

GAGATTTTAAT ATT AAGCTGAATGTTAAAGACGGTGACATCATGGACTGGGG AATG AAAGACTATAAAGCAAACG 
GATTTCCAGATAAGGTAA CAGATA TGGATGGAAATGTTTATCTTCAAACTGGCTATAGCGATTTGAATGCTAAAGC 
AGTTGGAGTCCACTATCAGTTTTTATATGATAATGTTAAACCCGAAGTAAACATTGATCCTAAGGGAAATACTAGT 
ATCGAATATGCTGATGGAAAATCTGTAGTCTTTAACATCAATGATAAAAGAAATAATGGATTCGATGGTGAGATT 

CAAGAACAACATATTTATATAAATGGAAAAGAATATACATCATTTAATGATATTAAACAAATAATAGACAAGACA
CTAAACATTAAGATTGTTGTAAAAGATTTTGCAAGAAATACAACCGTAAAAGAATTCATTTTAAA 
GGAGAGGTAAGTG AATT A AAACC TCATAGGGTAACTGTGACCATTCAA AATGG AAAAGAAATGAGTTCAACGATA 
GTGTCGGAAGAAGATTTTATTTTACCTGTTTATAAGGGTGAATTAGAAAAAGGATACCAATTTGATGGT^ 
TTTCTGGTTTCGAAGGTAAAAAAGACGCTGGCTATGTTATTAATCTATCAAAAGATACCrr^ 

40 C AAG A A AAT AG AG G AG AA A AAGG AG G AAG A A A AT A AA C CT A CTTTTG ATGT ATCG AAA AAG AAAG AT A ACCC AC 

AAGTAAACCATAGTCAATTAAATGAAAGTCACAGAAAAGAGGATTTACAAAGAGAAGAGCATTCACAAAAATCT 
G ATTC A ACT A AGG ATGTT AC AG CT A C A GTTCTTG AT A A A A AC A AT ATC AGT AGTA A ATC AA CT A CT A A C AATCCT 
A ATAAG TTGCCAAAAACTGGAACAGCAAGCGGAGCCCAGACACTATTAGCTGCCGGAATAATGTTTATAGTAGGA 
ATTTTTCTTGGATTGAAGAAAAAAAATCAAGATTAA 


YPVVLADTSSSEDALNISDKEKVAENKEKHENIHSAMETSQDFKEKKTAVIKEKEVVSKNPVIDNNTSNEEAKIKEENSN 
KSQGDYTDSFVNKm-ENPKKEDKVVYIAEFKDKESGEKAIKEl^SLKOTKVLYTYDRIFNGSAIETTPDNLDKIKQIEGIS 
SVERAQKVQPMMNHARKEIGVEEAIDYLKSINAPFGKNFDGRGMVISNIDTGTDYRHKAMRJDDDAKASMRFKKEDL 
KGTDKNYWLSDKIPHAFNYYNGGKITVEKYDDGRDYFDPHGMHIAGILAGNDTEQDIKNFNGIDGIAPNAQIFSYKMY 

SDAGSGFAGDETMFHAIEDSIKHNVDVVSVSSGFTGTGLVGEKYWQAIRALRKAGIPMVVATGNYATSASSSSWDLVA
NNHLKMTDTGNVTRTAAHEDAIAVASAKNQTVEFDKVNIGGESFKYRNIGAFFDKSKITTNEDGTKAPSKLKFVYIGK 
GQDQDLIGLDLRGKIAVMDRIYTKDLKNAFKKAMDKGARAIMVVNTVNYYNRDNWTELPAMGYEADEGTKSQVFSI 
SGDDGVKLWNMINPDKKTEVKRNNKEDFKDKLEQYYPIDMESFNSNKPNVGDEKEIDFKFAPDTDKELYKEDIIVPAG 
STSWGPR1DLLUCPDVSAPGKNIKSTLNVINGKSTYGYMSGTSMATPIVAASTVLIRPKLKEMLERPVLKNLKGDDKIDL 

TSLTKIALQNTARPMMDATSWKEKSQYFASPRQQGAGLINVANALRNEVVATFKNTDSKGLVNSYGSISLKEIKGDKK
YFTIKLHNTSNRPLTFKVSASAITTDSLTDRLKLDETYKDEKSPTCKOIVPEIHPEKVKGANITFEHDTFTIGANSSFDLN 
AVINVGEAKNKNKFVESFIHFESVEAMEALNSSGKKINFQPSLSMPLMGFAGNWNHEPILDKWAWEEGSRSKTLGGYD 
DDGKPKIPGTLNKGIGGEHGIDKFNPAGVIQNRKDKNTTSLDQNPELFAFNNEGINAPSSSGSKIANIYPLDSNGNPQDA 
QLERGLTPSPLVLRSAEEGLISIVNn-NKEGENQRDLKVISREHFIRGlLNSKSNDAKGIKSSKLKVWGDLKWDGLIYNPRG 

REENAPESKDNQDPATKIRGQFEPIAEGQYFYKFKYRLTKDYPWQVSYIPVKIDNTAPKIVSVDFSNPEKIKLITKDTYHK
VKDQYKNETLFARDQKEHPEKFDEIANEVWYAGAALVNEDGEVEKNLEVTYAGEGQGRNRKLDKDGNTIYEIKGAG 
DLRGKIIEV1ALDGSSNFTKIHR1KFANQADEKGMISYYLVDPDQDSSKY0KLGEIAESKFKNLGNGKEGSLKKDTTGVE 
HHHQENEESIKEKSSFTIDRNISTIRDFENKDLKKLIKKKFREVDDFTSETGKRMEEYDYKYDDICGNIIAYDDGTDLEYE 
TEKLDEIKSKIYGVLSPSKDGHFEILGK1SNVSKNAKVYYGNNYKSIEIKATKYDFHSKTMTFDLYANINDIVDGLAFAG 

DMRLFVKDNDQKKAEIKIRMPEKIKETKSEYPYVSSYGNVIELGEGDLSKNKPDNLTKMESGKIYSDSEKQQYLLKDNII
LRKGYALKVTTYNPGKTDMLEGNGVYSKEDIAKIQKANP 
VFTFKMNDKGEAIDKIXj^VTDSSKLVLFGKDDK^ 

YVRNPEFYLRGKISDKGGFNWELRVNESVVDNYUYGDLHIDmTU)FNK^ 

MDGNVYLQTGYSDLNAKAVGVHYQFLYDNVKPEVNIDPKGOTSIEYADGKSWFNINDKRNNGFDGEIQEQHIYINGK 
EYTSFNDIKQIIDKTLNIKrvVKDFAR^^TVKEHLNKDTGEVSELKPHRVTVTlQNGKEMSSTIVSEEDRLPVYKGELEK 
GYQFDGWE1SGFEGKKDAGYVINLSKDTFKPVFKKIEEKKEEENKPTFDVSKKKDNPOVNHSQLNESHRKEDLQREEH 
SQKSDSTKDVTATVLDKNNISSKSTTNNPNKLPKTGTASGAQTLLAAGIMFIVGIFLGLKKKNQDZ 

ID6 597 bp 



CTTG AATT A AAT A AA AAACGTC ATG CG A CT AAG C ATTTT A CTG AT A A G CTTGTTG ATCCC AAAG ATGTGCGT ACGG 
CTATCGAAATTGCAACCTTAGCGCCAAGCGCCCACAACAGCCAGCCTTGGAAAT7TGTGGTGGTACGTGAGAAAA 
ATGCTGAACTGGCAAAGTTAGCTTATGGTTCCAATTTTGAACAGGTATCATCAGCGCCTGTAACCATTGCCTTG 
TACAGAT ACGGA CTTAGCCAAACGTGCTCGTAAGATTGCCCGTGTTGGTGGTGCTAATAACTTTTCTGAAGAGCAA 
CTTCAATATTTTATGAAAAATCTGCCAGCTGAGTTTGCCCGTTACAGTGAGCAACAAGTCAGCGACTAC
TCAA TGCAGGTTTGGTTGCCATGAAC TTGG TTCrTGCATTGACAGACCAAGGAATTGGTTCTA^ 
TTTTGACAAATCAAAAGTTAATGAAGTTTTGGAAATCGAAGACCGTTTCCGCCCAGAACTCTTGATCACAGTGGGT 
TATACAGACGAAAAATTGGAACCAAGCTACCGCTTGCCAGTAGATGAAATCATCGAGAAAAGATAG 

LELNKKRHATKHFTDKLVDPKDVRTAIEIATLAPSAHNSQPWKFVVVREKNAELAKLAYGSNFEQVSSAPVTIALFTDT
DLAKRARKIARVGGANNFSEEQLQYFMKNLPAEFARYSEQQVSDYLALNAGLVAMNLVLALTDQGIGSNIILGFDKSK 
VNEVLEIEDRFRPELLITVGYTDEKLEPSYRLPVDEIIEKRZ



ID7 1401 bp



ATGACAGCAATTGATTTTACAGCAGAAGTAGAAAAACGCAAAGAAGACCTCTTGGCTGACTTGTTTAGCCTT^ 
GAAATCAATTCAGAACGTGATGACAGCAAGGCTGATGCCCAGCATCCATTTGGGCCTGGTCCAGTAAAAGCCTTG 
GAGAAATTCCTTGAAATCGCAGACCGCGATGGCTACCCAACTAAGAATGTTGATAACTATGCAGGACATTTTGAG 
TTTGGTGATGGAGAAGAAGTTCTCGGAATCTTTGCCCATATGGATGTGGTGCCTGCTGGTAGCGGTTGGGACACAG 
ACCCTTACACACCAACTATCAAAGATGGTCGCCTTTATGCGCGCGGGGCTTCGGACGATAAGGGTCCTACAACAG
CTTGTTACTATGGTTTGAAAATCATCAAAGAATTGGGT 

AGACGAAGAATCAGGCTGGGCAGACATGGACTACTACTTTGAGCACGTAGGACTTGCCAAACC^ 
CTCACCAGATGCTGAATTTCCAATCAT CAAT GGTGAAAAAGGAAATATCACGGAATACCTCCACTTTGCAGGAGA 
AAATACAGGTGTTGCCCGTCTTCACAGCTTTACAGGTGGTTTACGTGAAAATATGGTACCAGAATCAGCAACAGC 
AGTCGTTTCAGGTGACTTGGCTGACTTGCAAGCTAAACTAGATGCCTTTGTTGCAGAACACAAACTTAGAGGAGA
ACTCCAAGAAGAAGCTGGCAAATACAAGGTGACGATCATTGGTAAATCAGCCCACGGTGCTATGCCTGCTTCAGG 
TGTCAATGGCGCAACTTACCTTGCCCTCTTCCTCAGCCAGTTTGGCTTTGCTGGTCCAGC 

ATC GCAG GTAAAATTCTCnTGAACGATCATGAGGGTGAAAATCTTAAGATTGCTCATGTGGATGAAAAGATGGGT 
GCTCTTTCTATGAATGCCGGCGTCTTCCACnTCGATGAAACAAGTGCTGATAATACCATTGCCCTCAACATCCGCT 
ATCCAAAAGGAACAAGTCCAGAACAAATCAAGTCAATCCTTGAAAACTTGCCAGTTGTTTCTGTTAGCCTGTCTGA
ACACGG TCAC ACGCCTCACTATGTGCCAATGGAAGATCCACrTGTGCAAACCTTGTTGAATATCTATGAAAAACA 
AACTGGCTTTAAAGGTCATGAACAAGTCATCGGTGGTGGAACCTTTGGTCGCTTGCTAGAACGCGGAGTTGCCTA 
CGGTGCTATGTTCCCAGACTCGATTGATACCATGCACCAAGCCAATGAATTTATCGCCTTGGATGATCTTTTCCGA 
GCAGCAGCAATTTATGCCGAAGCTATTTACGAATTGATCAAATAA 


MTAIDFTAEVEKRKEDLLADLFSLLEINSERDDSKADAQHPFGPGPVKALEKFLEIADRDGYPTKNVDNYAGHFEFGD 
GEEVLGIFAHMDVVPAGSGWDTDPYTPTIKDGRLYARGASDDKGPTTACYYGLKIIKELGLPTSKKVRFIVGTDEESGW 
ADMDYYFEHVGLAKPDFGFSPDAEFPHNGEKGNITEYLHFAGENTGVARLHSFTGGLRENMVPESATAVVSGDLADL 
QAKLDAFVAEHKLRGELQEEAGKYKVTIIGKSAHGAMPASGVNGATYLALFLSQFGFAGPAKDYLDIAGKILLNDHEG 

ENLKIAHVDEKMGALSMNAGVFHFDETSADNTIALNIRYPKGTSPEQIKSILENLPVVSVSLSEHGHTPHYVPMEDPLVQ

TLLNIYEKQTGFKGHEQVIGGGTFGRLLERGVAYGAMFPDSIDTMHQANEFIALDDLFRAAAIYAEAIYELIKZ 
ID8 1617 bp 

GTGTATACTATTATAAAATCAAATATAAAAAAATTTAGTTTATTAA

AATTTATGCAGCAACTATTAATGCTCTGGTGTTGAATGAATTAATTGCGATGAATTTAGAGCGGTTIT^ 

TCAATCTACCAAATGATTGTCTGGTGTGGGATAATATTCCrTGACTGGGTAGTGAAAAATTATCAGGTTGAAGTGA 

TCCAAGAGTTTAATCTAGA GATT CGAAATAGAGTTGCCACAGACATCTCrAACTCTACCTATCAAGAATTTCATAG 

AH TAA^CATCAGGAACATATCTTTCGTGGCTAAATAATGATGTTCAGACTTTAAATGATC 

OU 1 l r i ' l AGT A A T AA A AGG A ATTTCTG GT A CT AT ATTTG C AGTTGTG A CT CTT AATC A CT ATC ATTG GTC ATTG ACTGT 

AGCCACCTTGTTTTCATTAATG ATTATGCT ACITGTACCAAAAATCITTGCATCGAAAATG 
AATTTAACTAACCAAAATGAAGCITTTTT 

TGAATC I i l i ATATGTATTGCCTAAGAAAATTAAAGAAGCAGGAATTTTATTAAAGATGGTTATACAAAGAAAGA 
CAACTGTAGAAACGTTAGCAGGCGCTATTAGCrTCTTTCTCAATA rri 1 ri lM CAGATATCTCTCGTrTTTTTAACA 
GGCTATCTTGCAATAAAAGGAATAGTGAAAATTGGTACTATTGAAGCAATAGGAGCACTAACAGGTGTTATTTTT
ACAGCGCTAGGTGAATTAGGAGGTCAATTATCCTCTATTATTGGTACGAAGCCTArrn i I 1 AAA ATTGTATTCAA

TTAATCCAATTGAGTCAAATAAAATGAATGATATCGAACCAAATGAGGTGAATAGAGATTTTCCGTTATATGAAG 

CAAAAAATATTTGCTATAj\GTATGGAGATAAAGAAATATTAAAAAACTTAAA'rrrrrG rri'l CAACGTAATGAAAA 

GTATTTAATTTTAGGTGAAAGTGGAAGCGGGAAATCTACATTATTAAAATTATTGAATGGC1 1 II 1GAGAGATTAT 

AGTGGAGAATTGCGATTCTGCGGGGATGATATAAAAAAAACCTCCTATTTAAATATGGTTTCGAATGT^ 

TAGATCAAAAAGCTTATTTGTTTGAAGGTACGATTAGAGATAATATTTTATTGGAAGAAAATTATACTGATGAAGA 

AATACTACAGTCrrTTAGAGCAAGTTGGTTTGAGTGTAAAAGATTTTCCTAATAACATTTTAGATTA 

GATGATGGGAGATTACTGTCAGGAGGGCAGAAACAAAAAATTACTTTAGCTAGAGGGCTAATTAGAAATAAGAA 

AATAGTATTAATTGACGAGGGAACTTCTGCTATCGATAGGAGAACTTCGTTAGCGATTGAACGTAAGATATTAGA 

TAGAGAGGATTTGACTGTCATTATTGTTACCCATGCTCCGCATCCGGAACTTAAACAATATTTTACTAAGATATAT 

C AATTTCC AAAGG A I'l'I'l ATTTAA 

MYTIHCSNIKKFSLLTinVAGQLLLIYAATINALVLNELlAMNLERFLKI^IYOMIVWCGnFLDWVVKNYOVEVIQEFNL 
EIRNRVATDISNSTYQEFHSKSSGTYLSWLNNDVQTLNDQAFKQLFLVIKGISGTIFAVVTLNHYHWSLTVATLFSLMIM 
LLVPKIFASKMREVSLNLTNQNEAFLKSSETILNGFDVLASLNLLYVLPKKIKEAGILIJCMVIQRKTTVETLAGAISFFLNI 
FFQISLVFLTGYLAIKGFVKIGTIEAIGALTGVIFTALGELGGQLSSIIGTKPIFLKLYSINPIESNKMNDIEPNEVNRDFPLYE 
AKNICYKYGDKEILKNLNFCFQRNEKYLILGESGSGKSTLLKLLNGFLRDYSGELRFCGDDIKKTSYLNMVSNVLYVDQ 
KAYLFEGTIRDNILLEENYTDEEILQSLEQVGLSVKDFPNNILDYYVGDDGRLLSGGQKQKITLARGLIRNKKIVLIDEGT 
SAIDRRTSLAIERKILDREDLTVriVTHAPHPELKQYFTKIYQFPKDFIZ 

ID9 705 bp

ATAACAGTTAAACAGATTATGGACGAAATAGCCGTTTCAGATATGACTGCAAGGCGCTATTTACAGGAATTAGCT 
GATAAAGATTTGCTGATTCGTGTGCATGGTGGAGCTGAAAAACTTCGAACCAACTCC Cri I i GACTAATGAGCG AT 
CAAATATTGAAAAACAAGCCCTCCAAACGGCAGAmAAAACAAGAAATAGCCCATTTTGCAGGCAGTCTAGTAGAA 
GAAAGAGAAACTATTTTCATTGGACCAGGAACAACATTAGAGTTTTTTGCGCGTGAGTTGC

gcgtcgtaaccaacagtctacctgtttttctgattttaagcgaacgaaaattaacagat^ 

a aatt atcgcgat attacaggtgc it n gttggtacattgaccctacaaaatctctctaatctccaattttctaaa 

gctttcgttagctgtaatggtattcaaaacggagctctagctaciu'iu'agcgaggaagagggagaggctcaacgc 

atcg citt aaat aattct aat aaaaa at attt actcg c ag atc at a gc a agttc a at a agtttg a v 1" i t 1 at acttt 

ttataatgtatcaaatcttgatactattgtttcagattctaaactaagtgattcaatccnttttaagct 

acattaaagtcatcaagccttaa 

itvkqimdeiavsdmtarrylqeladkdllirvhggaeklrtnslltnersniekoalqta£koeiahfagslveereti 

figpgttleffarelpidnirvvtnslpvflilserkltdliliggnyrditgafvgtltlqnlsnlqfskafvscngiqnga 

latfseeegeaqrialnnsnkkylladhskfnkfdfytfynvsnldtivsdsklsdsilfklskhikvikpz 

ID10 483 bp

ATGACTGAGTTTTCGTTAGATCITCTTCTAGAAGCCATTAAACT 

AGCTAGACAAAACAGATAAAGACCAAGAGCTTAAAACTGAAATTCAATCCATCTTTATCGAACACAAGGG AAATT 
ATGCTTATCGCCGGGTTCATITAGAACTAAGAAATCGTGGTrATCTGGTAAATCATAAAAGAGTTCAAGGCTTGaT 
GAAAGTACTCAATTTACAAGCTAAAATGCGAAAGAAACGAAAATATTCTTCTCATAAAGGAGACGTTGGTAAGAA 
GGCAGAGAATCTCATTCAAGCCCAATTTGAAGGCTCTAAAACAATGGAAAAGTGCTACACAGATGTGACTGAATT
TGCCATTCCAGCAAGTACTCAAAAGCTTTACTTATCACCAGTTTTAGATGGCTT^ 
AATCTTTCTTGTTCGCCTAATTTAGAATAA

MTEFSLDLLLEAIKLARWTYYYHLKQLDKTDKDQELKTEIQSIFIEHKGNYAYRRVHLELRNRGYLVNHKRVOGLMK 
VLNLQAKMRKKRKYSSHKGDVGKKAENLIQAQFEGSKTMEKCYTDVTEFAIPASTQKLYLSPVLDGFNSEIIAFNLSCS 
PNLEZ 

ID14 1266 bp 

CCAGGATTTGGTACCGTTGCAAGTGGTGTGCCTTTCCTCCT AAAGG AAAATGGAGGAAAAATCAATCAATCAGCA 
CA TTCA GA TATC AAAGTTGCTAAGGTATTGGTCAAGGATGAAGATGAAAAAAATCGCTTGCTTGCAGCAGGGAAT 
GACTTTAACnTTGTAACCAATGTGGATGATATTTTATCAGACCAGGATATTACT 

GTATTGAGCCTGCTAAAACCTITATCACTCGTGCCTTGGAAGCTGGAAAACACGTTGTTACTGCrAACAAG 

TTTAGCTGTCCATGGCGCAGAATTGCTAGAAATCGCTCAAGCTAACAAGGTAGCACTTTACTACGAAGCAGCAGT 

TGCTGGTGGGATTCCAATTCTTCGTACTTTAGCAAATTCCTTGGCTTCTGATAAAATTACGCGCGTGCTTG 

GTCAACGGAACTTCCAACTTCATGGTGACCAAGATGGTGGAAGAAGGCTGGTCTTACGATGATGCTCTTGCGGAA 

G CACA ACGTCTAGGATTTGCAGAAAGCGATCCGACGAATGACGTAGATGGGATTGATGCAGCCTACAAGATGGTT 

ATTTTGAGCCAATTTGCCTTTGGCATGAAGATTGCCTTTGATGATGTAGCCCACAAGGGAATCCGCAATATCACAC 

CAGAAGACGTAGCTGTAGCTCAAGAGCTTGGTTACGTAGTGAAATTGGTTGGTTCTATTGAGGAAACTTCTTCAGG 

TA TTGC TGCAGAAGTGACTCCAACCTTCCTACCTAAAGCGCACCCACTTGCTAGTGTGAATGGCGTAATGAACGCT 

GTCTITGTAGAATCTATCGGTATTGGTGAGTCTATGTACTACGGACCAGGTGCGGGTCAAAAACCAACTGCAACA 
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AGTGTTGTAGCTGATATTGTCCGTATCGTTCGTCGTTTGAATGATGGTACTATTGGCAAAGACTTCAACGAATATA 

GCCGTGACTTGGTCnTGGCAAATCCTGAAGATGTCAAAGCAAACTACTATTTCTCAATCTTGGCTCTAGACTCAAA 

AGGTCAGGTCTTGAAGTTGGCTGAAATCTTCAATGCTCAAGATATTTCCTTTAAGCAAATCCTTCAAGAT^ 

GAGGGTGACAAGGCGCGTGTCGTTATCATCACACACAAGATTAATAAAGCCCAGCTTGAAAATGTCTCAGCTGAA 

TTGAAGAAGGTTTCAGAATTCGACCTCTTGAATACCTTCAAGGTGCTAGGAGAATAA 

PGFGTVASGVPFLLKENGGKINQSAHSDIKVAKVLVKDEDEKNRLLAAGNDF^^^ 

AKTFITRALEAGKHVVTANKDLI^VHGAELLEIAQANKVALYYEAAVAGGIPILRTLANSLASDKITRVLGVVNGTSNF 
M\nTCM\^EGWSYT5DALAEAQRLGFA£SDPTNDVIXHDAAYKMVIL£QFAFG 

LGYVVKLVGSIEETSSGIAAEVTPTFLPKAHPLASVNGVMNAVFVESIGIGESMYYGPGAGQKPTATSVVADIVRIVRRL 
NDGTIGKDFNEYSRDLVLANPEDVKANYYFSILALDSKGQVLKLAEIFNAQDISFKQILQDGKEGDKARVVIITHKINKA 
QLENVSAELKKVSEFDLLNTFKVLGEZ 

ID16 1725 bp

ATGAAACACCTATTATCTTACTTCAAACCCTACATCAAGGAATCAATTTTAGCCCCCTTGTTCAAGCTGTTAGAAG 

CTGTTTTTGAGCTCnTGGTTCCCATGGTGATTGCTGGGATTGTTGACCAATCITTACCT 

TCTCTGGATGCAGATTGGCCTGCTCCTTATCTTTC 

CAGCAAAGGCAGCAGTAGGTTCTGCTAAGGAATTGACAAACGATCITTATC^ 

CAGCAGAGACCGTCTGACAACTTCTAGTTTGGTCACTCGCTTGACTTCGGATACCTACCAGATTCAGACTGGTATC 

AATCAATTCCTGCGTCTCTTTTTACGAGCGCCCATTATCGTTTTTGGTGCCATTTTTATGGCT^ 

TGAGTTGACTTTCTGGTTCTTAGTCTTGGTTGCCATTTTGACCATTGTCATTG 

CTTTCTACAGTAGTCTCAGAAAGAAAACGGACCAACTGGTTCAGGAAACGCGCCAGCAATTGCAAGGGATGCGGG 

TT ATTCGTG Cl'lTl GGTC AAG A A A A ACG AG AGTT AC AG A 1 TlTl CAAACCCTTAACCAAGTTTATGCTAGATTACA 

AGAAAAGACAGGTTTCTGGTCTAGTTTATTAACACCTCTGACCTATCTGATTGTCAATGGAACTCTTCT 

ATCTGGCAAGGCTATATTTCAATTCAAGGAGGAGTGCTCAGTCAAGGTGCTCTCATTGCTCTTATCAATTACCTCT 

TACAGATTTTGGTGGAATTGGTCAAGCTAGCCATGTTGATCAATTCCCTCAACCAGTCCTATATCTCAGTCAAGCG 

AATCGAGGAAGTCTTTGTTGAGGC TCCA GAGGATATCCATTCAGAGTTAGAACAAAAGCAAGCTACCAGAGATAA 

GGTTTTACAAGTCCAAGAATTGACCTTTACCTATCCTGA 

ATGACTCAAGGACAAATTCTAGGTATCATCGGGGGAACTGGTTCTGGTAAATCAAGCTTGGTGCAACTCTTACTT 
GACTTTATCCAGTAGACAAGGGGAACATTGACCTTTATCAAAATGGACGTAGTCCTCTTAATTTGGAGCAGTGGC 
GGTCTTGGATTGCCTATGTACCTCAAAAGGTCGAACTCTTTAAAGGAACCA 

CAATCAAGAAGTATCTGACCAGGAACTCTGGCAGGCCTTGGAGATTGCGCAAGCTAAGGATTTTGTCAGTGAAAA 

GGAAGGACTCTTGGATGCrCTAGTTGAGGCAGGGGGGCGAAATTTCTCAGGTGGACAAAAACAAAGATTGTCTAT 

CGCCCGAGCAGTCTTGCGCCAGGCTCCGTTTCTCATCCTAGATGATGCAACCTCGGCACTGGATACCATTACAGAG 

TCCAAGCTCTTGAAAGCTATTAGAGAAAATTTTCCAAACACGAGCTTAATTTTGATCTCTCAACGAACCT 

TACAGATGGCGGACCAGATTCTCCTCTTGGAAAAAGGTGAGTTGCTAGCTGTTGGCAAGCACGATGACTTGATGA 

AATCCAGCCAAGTCTATTGTGAAATCAATGCATCCCAACATGGAAAGGAGGACTAG 

MKHLLSYFKPYIKESILAPLFKLLEAVFELLVPMVIAGIVDQSLPQGDQGHLWMQIGLLLIFAVIGVLVALLAQFYSAKA 

AVGSAKELTNDLYRHILSLPKDSRDRLTTSSLVTRLTSDTYQIQTGINQFLRLFLRAPIIVFGAIFMAYRISAELTFWFLVL 

VAILTIVrVGLSRLVNPFYSSLRKKTDQLVQETRQQLQGMRVIRAFGQEKRELQIFQTLNQVYARLOEKTGFWSSLLTPL 

TYLIVNGTLLVIIWQGYISIOGGVLSQGALIALINYLLQILVELVKLAMLINSLNQSY1SVKRIEEVFVEAPEDIHSELEQKQ 

ATRDKVLQVQELTFTYPDAAQPSLRYISFDMTQGQILGIIGGTGSGKSSLVQLLLGLYPVDKGNIDLYQNGRSPLNLEQ 

WRSWUYVPQKVELFKGTIRSNLTLGFNOEVSCKJELWQALEIAQAKDFVSEKEGLLDALVEAGGRNFSGGQKQRLS1A 

RAVLRQAPFLILDDATSALDTITESKLLKAIRENFPNTSL1LISQRTSTLQMADQILLLEKGELLAVGKHDDLMKSSQVYC 

EINASQHGKEDZ 

ID18 1224 bp

ATGAAACGTTCTCTCGACTCAAGAGTCGATTACAGTTTGCTCTTGCCAGTA 1 TlTl T CTACTGGTCATCGGTGTGGT 

GGCTATCTATATAGCCGTTA GTCA TGATTATCCCAATAATATTCTGCCCATTTTAGGGCAGCAGGTCGCCTGGATT 

G CCTTG GGGCTTGTGATTGGTTTTGTGGTCATGCTCTTTAATACAGAA1 1TCIT1 GGAAGCTGACCCCCITTCTATA 

TATTTTAGGCTTGGGACTTATGATCTTGCCGATTGTATTTTATAATCCAAGCTTAGTTGCATCAACGGGTGCCAA^ 

AACTGGGTATCAATAAATGGAATTACCCTATTCCAACCGTCAGAATTTATGAAGATATCCTATATCCTCATGTTGG 

GTCGTGTCATTGTCCAATTTACAAAGAAACATAAGGAATGGAGACGCACGGTTCCGCTGGACTTTTTGTTAATTTT 

CTGGAT GATT CTCTTTACCAT TCCA GTCCTAGTTCTITTAGCACTTCAAAGTGACTTGGGGACG 

TAGCCATTTTCTCAGGAATCGTTTTATTATCAGGGGTTTCTTGGAAAATTATTATCCCAGTATTTGTGACTGCTGT^ 

ACAGGAGTTGCTGG'IM'lCIM'AGCTATCTTTATTAGCAAGGACGGACGAGCri'ri'CU'lCACCAGATTGGAATGCCGA 

CCT ACC AA ATT A ATCGG ATTTTGGCITGG CTC A ATCCCTTTG AGTTTGCCC AAACAACG ACTT ACC AG C AGG CTC A 

AGGGCAGATTGCCATTGGGAGTGGTGGCTTATrTGGTCAGGGATTTAATGCTTCGAATCTGCTTATCCCAGT^ 

GAGTCAGATATGATTTTTACGGTTATTGCAGAAGATTTTGGCTTTA^ 

C ATGTTG ATTT ACCGT ATGTTG AAG A TT ACTCTT A A ATC AA AT A A CC AGTTCT AC ACTT AT ATTTCC AC AGGTTTG A 
TTATG ATGTTG CTCTTCC A C ATCTTTG AG A AT ATCG GTG CTGTG ACTG G A CT ACTTCCTTTG ACGGGG ATTCCCTTG 
CCTTTCATTTCGCAAGGGGGATCAGCTATTATCAGTAATCTGATTGGTGTTGGTTTGCTTTTATCGATGAGT^
GACTAATCTAGCTGAAGAAAAGAGCGGAAAAGTCCCATTCAAACGGAAAAAGGTTGTATTAAAACAAATTAAATA
A

MKRSLDSRVDYSLLLPVFFLLVIGVVAIYIAVSHDYPNNILPiLGQQVAWIALGLVIGFVVMLFNTEFLWKVT-PFLYILGL 
GLMILPIVFYNPSLVASTGAKKWVSINGITLFQPSEFMKISYILMLARVIVQFTKKHKEWRRTVPLDFL

VLLALOSDLGTALVFVAIFSGIVLl^GVSWKIHPVFVTAVTGVAGFLAIFISKDGRAFLHQIGMPTYQINRILAWLNPFEF 
AQTTTYQOAQGQlAIGSGGLFGQGFNASNLLIPVRESDMIFrviAEDFGFIGSVLVIALYLMLIYRMLKITLKSNNQFYTY 
ISTGLIMMLLFHIFENIGAVTGLLPLTGIPLPFISQGGSAIISNLIGVGLLLSMSYQTNLAEEKSGKVPFKRKKVVLKQ1KZ 

ID22 987 bp

ATGGTGGCTAAGAAAAAAATCTTATTTTTTATGTGGTCTTTTTCTCTT 

CCATTGTTTCAAATCTGGATCCA GAAA AGTATGATATTGATATTCnTGAAATGGAGCACTTTGACAAGGGATATGA 
ATCTGTTCCAAAGCATGTACGCATTTTAAAATCCCTTCAAG ATTATCG CCAAACCAGATGGTTACGAG L ' l ' i T1T1 G 

TGGAGAATGAGAATTTATTTTCCAAGACTGACTCGTCGTTTGCTTGTAAAAGATGATTATGATGTTG

TTACCATTATGAATCCACCACTGTTGTTCTCTAAAAGAAGAGAAGTCAAGAAGATATCTTGGATTCATGGAAGTAT 
TGAAGAACTTCTTAAGGATAGCTCTAAAAGAGAATCACATAGAAGCCAGTTGGATGCTGCGAATACAATTGTAGG 
GATTTCAA AAAA GACCAGCAATTCTATCAAGGAAGTTTATCCAGATTATACTTCTAAATTACAGACAATCTACAAT 
GGATATGATTTTCAGACTATTCTAGAAAAATCTCAAGAGAAGATCGATATCGAGATTGCTCCTCAAAGTATCTGTA 

CTATCGGACGGATTG AGGAA AATAAGGGTTCTGACCGTGTAGTGGAAGTGATACGATTATTACACCAAGAGGGAA
AAAACTATCATCTCTATTTTATCGGGGCTGGTGATATGGAAGAGGAACTGAAAAAACGAGTCAAAGAGTATGGGA 
TTGAGGACTATGTACATTTCCTTGGTTATCAAAAAAATCCTTATCAGTATCTATCTCAGACGAAAGTTCTTTTGTCT 
ATGTCTAAACAAGAAGGTTTTCCTGGAGTGTATGTGGAGGCCTTGAGTCTGGGACTCCCITTTATCT 
TTGGAGGGGCrGAGGAATTATCCCAAGAAGGACGATTTGGACAAATCATTGAGAGCAATCAAGAGGCAGCTCAG 

GCGATTACTAATTACATGACTTCTGCCTCAAACTTTGATGTCGATGAGGCTAGCCAATTCATTCAACAA
TTACAAAACAAATCGAACAAGTAGAAAAACTATTAGAGGAGTAG 

MVAKKKILFFMWSFSLGGGAEKILSTrvS^DPEKYDIDILEMEHFDKGYESVPKHVRIUCSLQDYRQTRWLRAFLWRM 
RIYFPRLTRRLLVKDDYDVEVSFTIMNPPLLFSKRREVKKlSWIHGSIEELLKDSSKRESHRSQLDAAhO*rVGISKKTSNSIK 
EVYPDYTSKLQTIYNGYDFQTILEKSQEKIDIEIAPQSICTIGRIEENKGSDRVVEVIRLLHQEGKNYHLYFIGAGDMEEEL
KKRVKEYGIEDYVHFLGYQKNPYQYLSQTKVLLSMSKQEGFPGVYVEALSLGLPnSTDVGGAEELSQEGRFGQIIESNQ 
EAAQAITNYMTSASNFDVDEASQFIQQFTITKOIEOVEKLLEEZ 






ID23 1434 bp 



ATGGAAACTGCATTAATTAGTGTGATTGTGCCAGTCTATAATGTGGCGCAGTACCTAGAAAAATGGATAGCTTCCA 
TTCAGAAGCAGACCTATCAAAATCTGGAAATTATTCnTGTTGATGATGGTGCAACAGATGAAAGTGGTCGCT^ 
TGATTCAATCGCTGAACAAGATGACAGGGTGTCAGTGCTTCATAAAAAGAACGAAGGATTGTCGCAAGCACGAAA 
TGATGGGATGAAGCAGGCTCACGGGGATTATCTGATTTTTATTGACTCAGATGATTATATCCATCCAGAAATGATT 

CAGAGCTTATATGAGCAATTAGTTCAAGAAGATGCGGATGT7TCGAGCTGTGGTGTCATGAATGTCTATGCTAATG
ATGAAAGCCCACAGTCAGCCAATCAGGATGACTATTITGTCTGTGATTCTCAAACATTTCTAAAGGAATACCTCAT 
AGGTGAAAAAATACCTGGGACGATTTGCAATAAGCTAATCAAGAGACAGATTGCAACTGCCCTATCCTTTCCTAA 
GGGGTTGATTTACGAAGATGCCTATTACCATTTTGATTTAATCAAGTTGGCCAAGAAGTATGTGGTTAATACTAAA 
CCCTATTATTACTA TTTCC ATAGAGGGGATAGTATTACGACCAAACCCTATGCAGAGAAGGATTTAGCCTATATTG 

ATATCTACCAAA AGTT TTATAATGAAGTTGTGAAAAACTATCCTGACTTGAAAGAGGTCG C ' l ' i ' rrrr CAGATTGGC

CTATG CCCACl l'CrriATTCTGG ATAA GATGTTGCTAGATGATCAGTATAAACAGTTTGAAGCCTATTCTCAGATT 

CATCGi mi i AAAAGGCCATGCCTTTGCTATTTCTAGGAATCCAATTTTCCGTAAGGGGAGAAGAATTAGTGCTT 

TGGCCCTATTCATAAATATTTCCTTATATCGATTCTTATTACTGAAAAATATTGAAAAATCTAAAAAATTACATTA 
G 


METALISVIVPVYNVAQYLEKSIASIQKOTYQNLEIILVDDGATDESGRLCDSIAEQDDRVSVLHKKNEGLSOARNDGM 
KOAHGDYLIFIDSDDYIHPEMIQSLYEQLVQEDADVSSCGVMNVYANDESPQSANODDYFVCDSQTFLKEYLIGEKIPG 
TICNKLIKRQIATALSFPKGLIYEDAYYHFDLIKLAKKYVVNTKPYYYYFHRGDSITTKPYAEKDLAYIDIYQKFYNEVV 

KNYPDLKEVAFFRLAYAHFFILDKMLLDDQYKQFEAYSQIHRFLKGHAFAISRNPIFRKGRRISALALFINISLYRFLLLK 
NIEKSKKLHZ

ID24 735bp 

ATGAGAATCAAAGAGA AAACC AATAATATTAATGGAGGAATAAAAAATGTAAGTAAGCATTATGGTCATTCAATC
ATTCTCAAAGATATAAATTTTGCACTTAACAAGGGTGAAATTGTTGGTCTAGCAGGGAGAAATGGAGTTGGTAAG 
AGTACGTTGATGAAAATTCTTGT TCAGA ATAATCAACCGACTTCAGGTAATATTATAAGCAGTGATAATGTTGGGT 
ATTTAATCGAAGAACCAAAATTATTTTTATCTAAAACAGGTTTAGAGAATTTAAAATATTTGTCAAA 
TGTTGACTAC AATC AAGAAAGATTTAGATGTTTGATCCAAGAGTTAGATTTGACTCAGTCTATTAATAAAAAAGTA 

AAGACCTATTCTTTGGGTACAAAACAAAAATTAGCTTTGCTTCTAACTCTCGTTACGGAACCT^
TAGATGAACCGACTAATGGTTTAGATATTGAATCATCACAAATAG1

TGAAAATGTGGGAATTTTAATATCGAGTCATAAATTAGAAGACATTGAAGAAATTTGTGAGAGAGTTC1 MTCITG 
GAGA ACGGGC f I TTGACATTTCAAAAAGTAGG AAAAGATAGTCATAA IT I C 1 1 GTTTGAGATAGCTTTTTCATCAG 
CTACAGATAGAGACATTTTCATTACCAAACAAGAATTTTGGGATATTGTTTAG 


MRIKEKTNNINGGKKVSKHYGHSIIIJCDINFALN^ 
KLFI^KTGLENLKYI^NLYGVDYNQERFRCUQELDLTQSINKKVKT^^ 

SSQIVLAVLKKLALHENVGILISSHKLEDIEEICERVLFLENGLLTFQKVGKDSHNFLFEIAFSSATDRDIFITKQEFWDIVZ 
ID25 1704bp

ATGACTGAATTAGATAAACGTCACCGCAGTAGCATTTATGACAGCATGGTTAAATCACCTAACCGTGCTATGCTTC 
GTGCGACTGGTATGACAGATAAGGACTTTGAAACATCGATTGTGGGAGTGATTTCGACTTGGGCGGAAAATACAC 
CATGTAACATTCACTTGCATGATTTCGGGAAACTGGCTAAAGAAGGTGTCAAATCTGCAGGCGCTTGGCCTGTAC 

AGTTTGGAACCATTACCGTAGCGGACGGGATCGCTATGGGAACGCCTGGTATGCGTTTCTCTCTAACATCTCGTGA
CATCATCGCGGACTCCATCGAGGCGGCTATGAGTGGTCACAACGTGGATGCCTTCGTCGCTATCGGTGGCTGTGA 
CAAGAACATGCCTGGATCTATGATTGCTATTGCTAATATGGATATCCCAGCTATTTTCGCCrATGGTGGAACT 
GCACCGGGAAATCTTGATGGTAAAGATATCGACTTGGTTTCTGTCTTTGAAGGTATCGGAAAATGGAACCACGGT 
GACATGACAGCTGAGGACGTGAAACGTCTTGAATGTAATGCCTGCCCTGGCCCTGGTGGTTGTGGTGGTATGTAT 

ACTGCTAATACCATGGCAACTGCTATCGAAGTTCTAGGGATGAGTTTGCCAGGGTCATCCTCTCACCCAGCTGAAT
CAGCTGATAAGAAAGAAGATATCGAAGCAGCAGGACGTGCTGTTGTTAAGATGTTGGAACTTGGTCTCAAACCAT 
CAGATATCTTGACTCGTGAAGCCTTTGAj\GATGCTATCACTGTAACGATGGCTCTCGGTGGTTCTACAAACGCCA 
TCTTCACTTGCTCGCCATTGCCCATGCCGCAAATGTTGACTTGTCACTTGAGGACTTCAATACGATTCAAGAACGT 
GTGCCTCACTTGGCCGACTTGAAACCATCTGGTCAGTATGTCTTCCAAGACCTCTACGAAGTCGGTGGTGTCCCTG 

CGGTTATGAAGT ATTT GTTGGCAAATGGTTTCCnTCACGGAGATCGCATCACATGTACTGGTAAGACTGTAGCTGA
AAACTTGGCTGACTTTGCAGACTTGACTCCAGGCCAAAAAGTTATCATGCCACTTGA^^ 

TGGTCCGCTTATCATCTTGAACG GGAA CCrTGCTCCTGACGGTGCAGTTGCCAAGGTATCAGGTGTTAAAGTGCGT 
CGTCACGTTGGGCCAGCTAAGG TCTTT GACTCAGAAGAAGATGCGATTCAGGCCGTTCTGACAGATGAAATCGTT 
GATGGCGATGTAGTCGTTGTTCGTITTGTTGGACCTAAAGGTGGTCCTGGTATGCCT 
AATGATTGTTGGTAAAGGTCAGGGAGATAAGGTGGCCCTCTTGACGGACGGACGTTTCTCTGGTGGTACTTATGGT
CTGGTTGTTGGACATATCGCTCCTGAAGCTCAGGATGGTGGACCAATTGCCTATCTCCGTACCGGCGATATCGTTA 
CGGTTGACCAAGATACCAAAGAAATTTCTATGGCCGTATCCGAAGAAGAACTTGAAAAACGCAAGGCAGAAACA 
ACCTTGCCACCACTTTACAGCCGTGGTGTCCTCGGTAAATATGCCCACATCGTATCATCTGCTTCACGCGGAGCCG 
TGACAGACTTCTGGAATATGGACAAGTCAGGTAAAAAATAA 


MTELDKRHI^SIYDSMVKSPNRAMLRATGMTDKDFETSIVGVISTWAEOTPCNIHLHDFGKI^KEGVKSAGAWPVQ 
TITVADG1AMGTPGMRFSLTSRDIIADSIEAAMSGHNVDAFVAIGGCDKNMPGSM1ALANMDIPAIFAYGGTIAPGNLDG 
KDIDLVSVFEGIGKWNHGDMTAEDVKRLECNACPGPGGCGGMYTAOTMATAIEVLGMSLPGSSSHPAESADKKEDIE 
AAGRAVVKMLELGLKPSDILTREAFEDAITVTMALGGSTNATLHLLAIAHAANVDLSLEDFNTIOERVPHLADLKPSGQ 
YVFQDLYEVGGVPAVMKYLLANGFLHGDRITCTGKTVAENLADFADLTPGQKVIMPLENPKRADGPLIILNGNLAPDG
AVAKVSGVKVRRHVGPAKVFDSEEDAIQAVLTDEIVDGDVVVVRFVGPKGGPGMPEMLSLSSMIVGKGQGDKVALLT 
DGRFSGGTYGLVVGHIAPEAQDGGP1AYLRTGDIVTVDQDTKEISMAVSEEELEKRKAETTLPPLYSRGVLGKYAHIVSS 
ASRGAVTDFWNMDKSGKKZ 

ID26 274bp

ATGTTATAATAAAAATAAAGAATTTAAGGAGAAATACAATATGTCAATTTTTATTGGAGGAGCATGGCCATATGC 
AAACGGTTCGT TACAT ATTGGTCACGCGGCAGCGCTTTTACCGGGGGATATTCTTGCAAGATACTATCGTCAGAA 
GGGAGAGGAAGTTTTATATGTTTCTGGAAGTGATTGTAATGGAACCCCTATTTCTATCAGAGCTAAAAAAGAAAA 
TAAGTCTGTGAAAGAAATTGCTGATTTTTATCATAAGGAATTTAATCCA

CYNKNKEFKEKYNMSIFIGGAWPYANGSLHIGHAAALLPGDILARYYRQKGEEVLYVSGSDCNGTPISIRAKKENKSVK 
E1ADFYHKEFNP 

ID28 1065bp

ATGACAACATTATTTTCAAAAATTAAAGAAGTAACAGAACTTGCTGCAGTCTCAGGTCATGAAGCGCCTGTCCGT 

GCTTATCTTCGTGAAAAGTTGACACCGCATGTGGATGAAGTGGTGACAGATGGCTTGGGTGGTATTTTTGGTATC^ 

AACATTCAGAAGCTGTGGATGCACCGCGCGTCTTGGTCGCTTCTCATATGGACGAAGTTGGTTTTATGGTCAGCGA 

AATCAAGCCAGATGGTACCTTCCGTGTCGTAGAAATCGGTGGCTGGAACCCCATGGTGGTTAGCAGCCAACGTTT
CAAACTCTTGACTCGTGATGGTCATGAAATTCCTGTGATTTCAGGTTCTGTTCCTCCGCATTTGACTCGTGGA 
GGGGGACCAACCATGCCAGCCATTGCCGATATCGTTTTTGATGGTGGTTTTGCGGACAAGGCTGAGGCAGAAAGT 
TTTG G C ATCCGTC CTG GTG AT ACC ATTGT ACC AG AT AGTTCTG C AATTTTG A C AG CC A ATG AAAA AA AT ATC ATCT 
CAAAAGCTTGGGATAACCGCTACGGTGTCCTCATGGTAAGCGAGCTAGCTGAAGCTTTATCGGGTCAAAAACTCG 

GCAATGAACrCTATCTGGGTTCTAACGTCCAAGAAGAAGTTGGTCTGCGTGGCGCTCATACCTCTACAACCAAGTT
TGACCCAGAAGTCTTCCTCGCAGTTGATTGCTCACCAGCAGGTGATGTCTACGGTGGTCAAGGCAAGATTGGAGA
TGGAACCTTGATTCGTTTCTATGATCCAGGTCACTTGCrT 

GAAGAAGCTGGTATCAAGTACCAATACTACTGTGGTAAAGGCGGAACAGATGCAGGTGCAGCTCATCTGAAAAAT 
GGTGGTGTCCCATCAACA ACTA TCGGTGTCTGCGCTCGTTATATCCATTCTCACCAAACCCTCTATGCAATGGATG 
ACTTCCTAGAAGCGCAAGCri^CIU ACAAGCCTTGGTGAAGAAATTGGATCGTTCAACGGTTGATTTGATTAAACA 
TTATTAA 

MTTLFSKIKEVTELAAVSGHEAPVRAYLREKLTPHVDEVVTDGLGGIFGDCHSEAVDAPRVLVASHMDEVGFMVSEIKP 

DGTFRWEIGGWNPMVVSSQRFKLLTRDGHEIPVISGSVPPHLTRGKGGPTMPAIADIVFDGGFADKAEAESFGIRPGDT 

IVPDSSAILTANEKNIISKAWDNRYGVLMVSELAEALSGQKLGNELYLGSNVQEEVGLRGAHTSTTKFDPEVFLAVDCS 

PAGDVYGGQGKIGDGTLIRFYDPGHLLLPGMKDFLLTTAEEAGIKYQYYCGKGGTDAGAAHLKNGGVPSTTIGVCARY 

IHSHQTLYAMDDFLEAQAFLQALVKKLDRSTVDLIKHYZ 

ID31 1182bp 

ATGGAATTTTCTATGAAATCAGTCAAAGGACTACrCTTTATCATAGCTAGTTTT^^ 
GAACACTTCTCCCCAATTCATGATTCCAGGACrAGCTTTAAC^ 

CTCCC ACT A CT AG AAAG CTGGTTTCA C AGTTTG G A G A AGGTCT AC AC CGTC C A C AA ATTC AC AG CCTTTCTCTC A A 
TCATCCTACTA ATCT TTCATAACTTTAGTATGGGCGGTTTGTGGGGCTCTCGCTTAGCTGCT 

GCCATCTATATCTTTGCCAGCATCATC CTTG TCGCCTATTTAGGCAAATACATCCAATACGAAGCTTGGCGATGGA 

TTCACCGCCTGGT TTACC rAGCCTATATTTTAGGACTCTTTCACATCTACATGATAATGGGCAATCG 

TTT A ATCTTCT AAG' 1 'i'i'l CTJ GTTGGT AGCT ATG CCCTTTT AGGCTT ACT AG CTGGTTTTT AT ATCATTTTTCTATAT 

CAAAAGATTTCCITCC CCTA TCTAGGGAAAATTACCCATCTCAAACGCITAAATCACGATACTAGAGAAATTCAA 

ATCCATCTTAGCAGACCTTTCAACTATC^ 

GTGCTCCGCATCCCITTTCTATCrCAGGAGGTCATGGTCAAACTCTTTACTrr^ 

T ACC AAG AAT ATCT ATG AT A ATCTTC AAG C CGG C AG C AAAGT AACCCT AG AC AG AGCTT ACG G A C AC ATG ATC AT 

AGAAGAA GGACG AGAAAATCAGGTTTGGATTGCTGGAGGTATTGGGATCACCCCCTTCATCTCTTACATCCGTGA 

ACATCCTATTTTAGATAAACAGGTTCACTTCTACTATAGCTTCCGTGGAGATGAAAATGCAGTCTACCT 

CTCCGTAACTATGCTCAGAAAAATCCTAATTTTGAACTCCATCTAATCGACAGTACGAAAGACGGCTATCTTAATT 

TTGAACAAAAAGAAGTGCCCGAACATGCAACCGTCTATATGTGTGGTCCTATTTCTATGATGAAGGCACTTGCCA 

AACAGATTAAGAAACAAAATCCAAAAACAGAGCATATTTAC 

MEFSMKSVKGLLFIIASFILTLLTWMOTSPQFMIPGLALTSLSLTHL^ 

NFSMGGLWGSRLAAQFGNLAIYIFASIILVAYLGKYIQYEAWRWIHRLVYLAYILGLFHIYMIMGNRLLTFNLLSFLVGS 
YALLGLLAGFYIIFLYQKISFPYLGKITHLKRLNHDTREIQIHLSRPFNYQSGQFAFLKIFQEGFESAPHPFSISGGHGQTLY 
FTVKTSGDHTKNIYDNLQAGSKVTLDRAYGHMIIEEGRENQVWIAGGIGITPFISYIREHPILDKQVHFYYSFRGDENAV 
YLDLLRNYAQKNPNFELHLIDSTKDGYLNFEQKEVPEHATVYMCGPISMMKALAKQIKKQNPKTEHIY 

ID32 900bp

ATG AC l iriAAATCAGGCrTTGTAGCCATTTTAGGACGTCC^ 

TGGGGCAAAAGATTGC CATC ATGAGTGACAAGGCGCAGACAACGCGCAATAAAATCATGGGAATTTACACGACTG 

ATAAGGAGCAAATTGTCTTTATCGACACACCAGGGATTCACAAGCCTAAAACAGCTCTCGGAGATTTCATGGTTG 

AGTCTGCCTACAGTACCCITCGCGAAGTGGACACTGTTCTTTTCATGGTGCCrc 

GGACGATATGATTATCGAGCGTCrCAAGGCTGCCAAGGTTCCTGTGATTTTGGTGGTGAATAAAATCGATAAGGTC 

CATCCAGACCAGCTCTTGTCrCAGATTGATGACTTCCGTAATCAAATGGACTTTAAGGAAATTGTTCCAATCTCAG 

CCCTTCAGGGAAATAACGTGTCrCGTCTAGTGGATATTTTGAGTGAAAATCTGGATGAAGGTTTCCAATATTTCCC 

GTCTGATCAAATCACAGACCATCCAGAACGTTTCTTGGTTTCAGAAATGGTTCGCGAGAAAGTCTTGCACCrAACT 

CGTGAAGAGATTCCGCATTCTGTAGCAGTAGTTGTTGACTCTATGAAACGAGACGAAGAGACAGACAAGGTTCAC 

ATCCGTGCAACCATCATGGTCGAGCGCGATAGCCAAAAAGGGATTATCATCGGTAAAGGTGGCGCTATGCTTAAG 

AAAATCGGTAGCATGGCCCGTCGTGATATCGAACTCATGCTAGGAGACAAGGTCTTCCTAGAAACCTGGGTCAAG 

GTCAAGAAAAACTGGCGCGATAAAAAGCTAGATTTGGCTGACTTTGGCTATAATGAAAGAGAATACTAA 

MTFKSGFVAILGRPNVGKSTFLNHVMGQKIAIMSDKAOTTRNKIMGIYTTDKEQIVFIDTPGIHKPKTALGDFMVESAYS 
TLREVDTVLFMVPADEARGKGDDMIIERLKAAKVPVILVVNKIDKVHPDQLLSQIDDFRNQMDFKEIVPISALQGNNVS 
RLVDILSENLDEGFQYFPSDQITDHPERFLVSEMVREKVLHLTREErPHSVAVVVDSMKRDEETDKVHIRATIMVERDSQ 
KGIIIGKGGAMLKKIGSMARRDIELMLGDKVFLETWVKVKKNWRDKKLDLADFGYNEREYZ 

ID33 855bp

CTGCl IL1 JGTTTTrACAGAAGGAGCACTTATGCCTGAATTACCT 

AATTGATTATAGGAAAGAAGATTTCGAGTATAGAAATTCGCTACCCCAAGATGATTAAGACGGATTTGGAAGAGT 
TTCAAAGGGAATTGCCTAGTCAGATrATCGAGTCAATGGGACGTCGTGGAAAATATTTGCrTTTTTATCTGACAGA 

caaggtc ttgatttc c cattt gcggatggagggcaagtatttttactatccagaccaaggacctgaacgcaagcat 
gcccatgt 1 1 1 c 1 1 icattttgaagatggtggcacgcttgtttatgaggatgttcgcaagtttggaaccatggaac
TCTTGGTGCCTGACCTTTTAGACGTCTACTTTAT
tl TTTACAGGTCTTTCAATCTGCCCTTGCCAAGTC^ 

GCTGGACTTGGCAATATCTATGTGGATGAGGTTCTCTGGCGAGCTCAGGTTCATCCAGCTAGACCTTCCCAGACTT 
TGACAGCAGAAGAAGCGACTGCCATTCATGACCAGACCATTGCTGTTTTGGGCCAGGCTGTTGAAAAAGGTGGCT 
CCACCATTCGGACTTATACCAATGCCTTTGGGGAAGATGGAAGCATGCAGGACTTTCATCAGGTCTATGATAAGA
CTGGTCAAGAATGTGTACGCTGTGGTACCATCATTGAGAAAATTCAACTAGGCGGACGTGGAACCCACri'riGTCC 
AAACTGTCAAAGGAGGGACTGA 



MLLVFTEGGLMPELPEVETVCRGLEKLnGKKISSIEIRYPKMIKTDLEEFQRELPSQIIESMGRRGKYLLFYLTDKVUSHL 
RMEGKYFYYPDQGPERKHAHVFFHFEIXjGTLVTEDVRKFGTM \

K5KKPIKSHLLDQTLVAGLGNIYVDE\O.WRAQVHPARPSOTLTAEEATAIHIX?TIAVLGQAVEKGGSTIRTYTNAFGED 
GSMQDFHQVYDKTGQECVRCGTIIEKIQLGGRGTHFCPNCQRRDZ 

ID34 633bp 


TTGTCCAAACT GTCAA AGGAGGGACTGATGGGAAAAATCATCGGAATCACTGGGGGAATTGCCTCTGGTAAGTCA 
ACTGTGACAAATTTTCTAAGACAGCAAGGCTTTCAAGTAGTGGATGCCGACGCAGTCGTCCACCAACrACAGAAA 
CCTGGTGGTCCTCTGTTTGAGGCTCTAGTACAGCACTITCG 

GCCCTCTCCTAGCTAGTCTCATCTTTTCAAATCCTGATGAACGAGAATGGTCTAAGCAAATTCAAGGGGAGATTAT 
CCGTGAGGAACTGGCTACTTTGAGAGAACAGTTGGCTCAGACAGAAGAGA' l ' rrrc rT CATGGATATTCCCCTACTT

TTTGAGCAGGACTACAGCGATTGGTTTGCTGAGACrTGGTTGGTCTATGTGGACCGAGATGCCCAAGTGGAACGC 
TTAATGAAAAGGGACCAGTTGTCCAAAGATGAAGCTGAGTCTCGTCTGGCAGCCCAGTGGCCTTTAGAAAAAAAG 
AAAGATTTGGCCAGCCAGGTTCTTGATAATAATGGCAATCAGAACCAGCTTCTTAATCAAGTGCATATCCTTCT 
AGGGAGGTAGGCAAGATGACAGAGATTAA 


MSKI^KEGLMGKIIGITGGlASGKSTVTNFLRQQGJHJVVDADAVVHQLQKPGGRLFEALVQHFGOEnLENGELNRPLL 

ASLIFSNPDEREWSKQI(^EUREELATLREQLAQTEEIFFMDIPLLFEQDYSDWFAETWLVYVDRDAQVERLMKRDQLS 

KDEAESRLAAQWPLEKKKDLASQVLDNNGNQNQLLNQVHILLEGGRQDDRDZ 

ID35 1269bp

TTGATAATAATGGCAATCAGAACCAGCTTCTTAATCAAGTGCATATCCTTCTTGAGGGAGGTAGGCAAGATGACA 
GAG ATTAA CTGGAAGGATAATCTGCGCATTGCCTGGTTTGGTAATTTTCTGACAGGAGCCAGTA rr iC ' rri 'GG 
TACCrilU'ATGCCCATCTTCGTGGAAAATCTAGGTGTAGGGAGTCAGCAAGTCGCTTTTTATGCAGGCTTAGCAAT 

TTCTGTCTCTGCTATTTCCGCGGCGCTCnTTTCTCCTATTTGGGGTATTCTTGCT

TGAT GATTCGGGCAGGTCTTGCTATGACTATCACTATGGGAGGCrTGGCCTTTGTCCCAAATATCTAT^ 

CT I i CI 1 CGTTTACTAAACGGTGTATTTGCAGG ITH GT1 CCTAATGCAACGGCACTGATAGCCAGTCAGGTTCCA 

AAGG AGAA ATCAGGCTCTGCCTTAGGTACTTTGTCTACAGGCGTAGTT 

GTGGCTTTAT CGCA GAATTATTTGGCATTCGTACAGTITICITACT^ 

TGACTATTTGCTTTATCAAGGAAGATTTTCAACC^

CTCGGT TAAA TATC CCTA TCTTlTGCrCAATCrCTTTTTAACCAGTTTTG 
GCCCTATTTTGGCTCTTTATGTACGCGACTTAGGGCA 

AGTATGGGCTTTTCC AGCA TGATGAGTGCAGGAGTCATGGGCAAGCTAGGTGACAAGGTGGGCAATCATCGTCTC 
TTGGTTGTCGCCCAGTTTTATTCAGTCATCATCTATCTCCTCTGTGCCAATGCCTCT 
CTATCGTTTCCTCTTTGGATTGGGAACCGGTGCCTTGATTCCCGGGGTTAATGCCCTACT

AAAGCCGGCATTTCGAGGGTCTTTGCCTTCAATCAGGTATTCTTTTATCTGGGAGGTGTTGTTGGTCCCATGGC 

GTTCTGCAGTAGCAGGTCAATTTGGCTACCATGCTGTCTTTTATGC^ 

TTTAACCTGATTCAATTTCGAACATTATTAAAAGTAAAGGAAATCTAG 

MIIMAIRTSFLIKCISFLREVGKMTEINWKDNLR1AWFGNFLTGASISLVVPFMPIFVENLGVGSQQVAFYAGLAISVSAIS
AALFSPIWGILADKYGRKPMMIRAGLAMTITMGGLAFVPNIYWLIFLRLLNGVFAGFVPNATALIASQVPKEKSGSALG 
TLSTGVVAGTLTGPFIGGFIAELFGIRTVFLLVGSFLFLAAILTICFIKEDFQPVAKEKAIPTKELFTSVKYPYLLLNLFLTS 
FVIQFSAQSIGPILALYVRDLGQTENLLFVSGLIVSSMGFSSMMSAGVMGKLGDKVGNHRLLVVAQFYSVIIYLLCANAS 
SPLQLGLYRFLFGLGTGALIPGVNALLSKMTPKAGISRVFAFNQVFFYLGGVVGPMAGSAVAGQFGYHAVFYATSLCV 

AFSCLFNLIQ FRTLLK VKEIZ

ID36 1311 bp 



ATGGCCCTACCAACTATTGCCATTGTAGGACGTCCCAATGTTGGGAAATCAACCCTATTTAATCGGATCGCTGGTG 

AGCGAATCTCCATTGTAGAAGATGTCGAAGGAGTGACACGTGACCGTATTTATGCAACGGGTGAGTGGCTCAATC 

GTTCTTTTAGCATGATTGATACAGGAGGAATTGATGATGTCGATGCTCCTTTCATGGAACAAATCAAGCACCAGGC 

AGAAATTGCCATGGAAGAAGCAGATGTTATCGTTTTTGTCGTGTCTGGTAAGGAAGGAATTACTGATGCAGACGA 

ATACGTAGCTCGTAAGCTTTATAAGACCCACAAACCAGTTATCCTCGCAGTCAACAAGGTGGACAACCCTGAGAT 

GAGAAATGATATATATGATTTCTATGCTCTCGGTTTGGGTGAACCATTGCCTATCTCATCTGTCCATGGAATCGGT 

ACAGGGGATGTGCTAGATGCGATCGTAGAAAATCTTCCAAATGAATATGAGGAAGAAAATCCAGATGTCATTAAG 

TTT A G CTTG ATTGGTCGTCCT A ACG TTGG AA A ATC AAG CTTG ATC A ATG CT ATCTTG GG AG AAG A CCGTGTT ATTG
CTAGTCCTGTTGCTGGAACAACTCGTGATGCCATTGATACCCACTTTACAGATACAGATGGTCAAGAGTTTACCAT
GATTGATACGGCTGGTATGCGTAAGTCTGGTAAGGTTTATGAAAATACTGAGAAATACTCTGTTATGCGTGCCATG 
CGTGCTATTGACCGTTCAGATGTGGTCTTGATGGTCATCAATGCGGAAGAAGGCATTCGTGAGTACGACAAGCGT 
ATCGCAGGATTTGCCCATGAAGCTGGTAAAGGGATGATTATCGTGGTCAACAAGTGGGATACGCTTGAAAAAGAT 
AACCACA CTAT GAAAAACTGGGAAGAAGATATCCGTGAGCAGTTCCAATACCTGCCTTACGCACCGATTATCTTT
GTATCAGCITTAACCAAGCAACGTCTCCACAAACTTCCTGAGATGATTAAGCAAATCAGCGAAAGTCAAAATACA 
CGTATTCCATCAG CTGTCT TGAACGATGTCATCATGGATGCCATTGCCATCAACCCAACACCGACAGACAAAGGA 
AAACGTCTCA AGATT TTCTATGCGACCCAAGTGGCAACCAAACCACCAACCTTTGTCATCTTTGTCAATGAAGAAG 
AACTCATGCACTTTTCTTACCrGCGTTTCT^^ 
TC ATCTC ATCG C AAG AAAACG C A AAT A A

MALPTIAIVGRPhTVGKSTLFNRIAGERlSIVEDVEGVTRDRIYATGEWLNRSFSMIDTGGIDDVDAPFMEOIKHQAEIAM 
EEADVTVFVVSGKEGITDADEYVARKLYKTHKPVILAVNKVDNPEMRNDIYDFYALGLGEPLPISSVHGIGTGDVLDAI 
VENLPNEYEEENPDVIKFSLIGRPNVGKSSLINAILGEDRVIASPVAGTTRDAIDTHFTDTDGQEFTMIDTAGMRKSGKV 
YEhTTEKYSVMRAMRAIDRSDVVLMVINAEEGII^YDKRlAGFAHEAGKGMIIVVNKWDTLEKDNHTMKNWEEDIREQ
FQYLPYAPIIFVSALTKQRLHKLPEMIKQISESQOTRIPSAVLNDVIMDAIAINPTPTDKGKRLKIFYATQVATKPPTFVIFV 
NEEELMHFSYLRFLENQIRKAFVFEGTPIHLIARKRKZ 

ID37 714bp 


ATGACAGAAACC ATTAA ATTGATGAAGGCTCATACTTCAGTGCGCAGGTTTAAAGAGCAAGAAATTCCCCAAGTA 
GACITAAATGAGATTTTGACAGCAGCCCAGATGGCATCATCrTGGAAGAATTTCCAATCCTACTCTGTGATTGTG 
TAC GAAG TCAAGAGAAGAAAGATGCCTTGTATGAATTGGTACCTCAAGAAGCCATTCGCCAGTCTGCTGTTTTCCT 
TCTCTTTGTCGGAGATTTGAACCGAGCAGAAAAGGGAGCCCGACTTCATACCGACACCTTCCAACCCCAAGGTGT 

GGAAGGTCTCTTGATTAGTTCGGTCGATGCAGCTCTTGCTGGACAAAACGCCTTGTTGGCAGCTGAAA

TATGGTGGT GTGAT TATCGGTTTGGTTCGATACAAGTCTGAAGAAGTGGCAGAGCrCTTTAACCTACCTGACTACA 
CCTATTC TGTC TTTGGGATGGCACTGGGTGTGCCAAATCAACATCATGATATGAAACCGAGACTGCCACTAGAGA 
ATGTTGTCTTTGAGGAAGAATACCAAGAACAGTCAACTGAGGCAATCCAAGCTTATGACCGTGTTCAGGCTGACT 
ATGCTGGGGCGCGTGCGACCACAAGCTGGAGTCAGCGCCTAGCAGAACAGTTTGGTCAAGCTGAACCAAGCTCAA 

CTAGAAAAAATCTTGAACAGAAGAAATTATTGTAG

MTETIKLMKAHTSVRRFKEQEIPQVDLNEILTAAQMASSWKNFQSYSVIVVRSQEKKDALYELVPOEAIROSAVFLLFV 
GDLNRAEKGARLHTDTFQPQGVEGLLISSVDAALAGQNALLAAESLGYGGVHGLVRYKSEEVAELFNLPDYTYSVFG 
MALGVPNQHHDMKPRLPLENVVFEEEYQEQSTEAIQAYDRVQADYAGARATTSWSQRLAEQFGQAEPSSTRKNLEQK 
KLLZ



ID38 729bp 

ATGACAGAAATTAGACTAGAGCACGTCAGTTATGCCTATGGTCAGGAGAGGATTTTAGAGGATATCAACCTACAG 
GTG ACTT CAGGCGAAGTGGTTTCCATCCTAGGCCCAAGTGGTGTTGGAAAGACCACCCTCTTTAATCTAATCGCTG
GGATTTTAGAAGTTCAGTCAGGGAGAATTGTCCTTGATGGTGAAGAAAATCCCAAGGGGCGCGTGAGTTATATGT 
TGCAAAAGGATCTGCTCTTGGAGCACAAGACGGTGCTTGGAAATATCATTCTGCCCCTCTTGATTCAAAAGGTGG 
ATAAGGCAGAAGCTATTTCCGGAGCGGATAAAATTC1T"GCGACCTTCCAGCTGACAGCTGTAAGAGACAAGTATC 
CTCATGAACTTA GCGG TGGGATGCGCCAGCGTGTAGCCTTACTCCGGACCTACCTTTTTGGGCACAAGCTCTT^ 
CTTAGATGAGGCCTTTAGCGCCTTGGATGAGATGACAAAGATGGAACTCCACGCnTGGTATCITGAGATTCACAA
GCAGTTGCAGCTAACAACCCTGATCATCACGCATAGTATTGAGGAGGCCCTCAATCTCAGCGACCGTATCTATATC 
TTGAAAAATCGCCCTGGGCAGATTGTTTCAGAAATTAAACTAGATTGGTCTGAAGATGAGGACAAGGAAGTCCAA 
AAGATTGCCTACAAACGTCAAATTTTGGCGGAATTAGGCTTAGATAAGTAG 

MTEIRLEHVSYAYGQERILEDINLQVTSGEVVSILGPSGVGKTTLFNLIAGILEVQSGRIVLDGEENPKGRVSYMLOKDLL
LEHKTVLGNHLPLLIQKVDKAEAISRADKILATFQLTAVRDKYPHELSGGMRQRVALLRTYLFGHKLFLLDEAFSALDE 
MTKMELHAWYLEIHKQLQLTTLIITHSIEEALNLSDRIYILKNRPGQIVSEIKLDWSEDEDKEVQKIAYKROILAELGLDK 



ID39 2433bp

ATG A A CT ATTC A AA AG C ATTG AATG A ATGT ATCG A A AGTG C CT A C ATG GTTG CTGG A C ATTTTG G AG CTCGTT ATC 
TAGAGTCGTGGCACTTGTTGATTGCCATGTCTAATCACAGTTATAGTGTAGCAGGGGCAACTTTAAATGATTATCC 
GTATGAGATGGACCGTTTAGAAGAGGTGGCTTTGGAACTGACTGAAACGGACTATAGCCAGGATGAAACCTTTAC
GGAATTGCCGTTCTCCCGTCGTTTGCAGGTTCTTTTTGATGAAGCAGAGTATGTAGCGTCAGTGGTCCATGCTAAG
GTACT AGGGA CAGAGCACGTCCTCTATGCGATTTTGCATGATAGCAATGCCTTGGCGACTCGTATCTTGGAGAGG 
GCTGGi i i I i^l lATGAAGACAAGA AAGA TCAGGTCAAGATTGCTGCTCTTCGTCGAAATTTAGAAGAACGGGCA 
GGCTGGACTCGTGAAGATCTCAAGGCTTTACGCCAACGCCATCGTACAGTAGCTGACAAGCAAAATTCTATGGCC 
AATATGATGGGCATGCCGCAGACTCCTAGTGGTGGTCTCGAGGATTATACGCATGATTTGACAGAGCAAGCGCGT 
TCTGGCAAGTTAGAACCAGTCATCGGTCGGGACAAGGAAATCTCACGTATGATTCAAATCTTGAGCCGGAAGACT
AAGAACAACCCTGTCrrGGTTGGGGATGCTGGTGTCGGGAAAACAGCTCTGGCGCTTGGTCTTGCCCAGCGTATTG
CTAGTGGTGACGTGCCTGCGGAAATGGCTAAGATGCGCGTGTTAGAACITGATTTGATGAATGTCGTTGCAGGGA 
C ACGC TTCCGTGGTGACTTTGAAGAACGCATGAATAATATCATCAAGGATATTGAAGAAGATGGCCAAGTCATCC 
TCTTTATCGATGAACTCCACACCATCATGGGTTCTGGTAGCGGGATTGATTCGACTCTGGATGCGGCCAATATCTT 
GAAACCAGCCTTG GCGC GTGGAACTTTGAGAACGGTTGGTGCCACTACTCAGGAAGAATATCAAAAACATATCGA
AAAAGATGCGGCACTTTCTCGTCGTTTCGCTAAAGTGACGATTGAAGAACCAAGTGTGGCAGATAGTATGACTAT 
TTTACAAGGTTTGAAGGCGACTTATGAGAAACATCACCGTGTACAAATCACAGATGAAGCGGTTGAAACAGCGGT 
TAAGATGGCTCATCGTTATTTAACCAGTCGTCACTTGCC^ 

ACAGTGCAAAATAAGGCAAAGCATGTAAAAGCAGACGATTCAGATTTGAGTCCAGCTGACAAGGCCCTGATGGAT 
GGCAAGT GGAAA CAGGCAGCCCAGCTAATCGCAAAAGAAGAGGAAGTACCTGTCTACAAAGACTTGGTGACAGA
GTCTGATATTTTGACCACCTTGAGTCGCTTGTCAGGAATCCCAGTTCAAAAACTGACTCAAACGGATGCTAAGAAG 
TATTTAAATCTTGAAGCAGAACTCCATAAACGGGTTATCGGTCAAGATCAAGCTGTTTCAAGCATTAGCCGTGCCA 
TTCGCCGCAACCAGTCAGGGATTCGCAGTCATAAGCGTCCGATTGGTTCCTTTATGTTCCTAGGGCCTACAGGTGT 
CGGGAAAACTGAATTAGCCAAGGCTCTGGCAGAAGTTCTTTTTGACGACGAATCAGCCCT^ 
AGTGAGTATATGGAGAAATTTGCAGCTAGTCGTCTCAACGGAGCTCCTCCAGGCTATGTAGGATATGAAGAAGGT
GGGG AGTTG ACAGAGAAGGTTCGCAATAAACCCTATTCCGTTCTCCTCTTTGATGAGGTAGAGAAGGCCCACCCA 
GATATCTTTAATGTTCTCTTGCAGGTTCTGGATGACGGTGTCTTGACAGATAGCAAGGGACGCAAGGTCGATTTTT 
CAAATACCATTA TCAT TATGACATCGAATCTAGGTGCGACTGCCCTTCGTGATGATAAGACTGTTGGTTTTGGGGC 
TAAGGATATTCGTTTTGACCAGGAAAATATGGAAAAACGCATGTTTGAAGAACTGAAAAAAGCTTATAGACCGGA 
ATTCATCAAC CGTA TTGATGAGAAGGTGGTCTTCCATAGCCTATCTAGTGATCATATGCAGGAAGTGGTGAAGATT
ATGGTCAAGCCITTAGTGGCAAGTTTGACTGAAAAAGGCATTGACTTGAAATTACAAGCTTCAGCT 
TAGCAAATCAAGGATATGACCCAGAGATGGGAGCTCGCCCACTTCGCAGAACCCTGCAAACAGAAGTGGAGGAC 
AAGTTGGCAGAACTTCTTCTCAAGGGAGATTTAGTGGCAGGCAGCACACTTAAGATTGGTGTCAAAGCAGGCCAG 
TT AAAATTTG ATATTG C ATAA 


mnyskalneciesaymvaghfgaryleswhlliamsnhsysvagatlndypyemdrleevaleltetdysqdetfte 
lpfsrrlqvlfdeaeyvasvvhakvlgtehvlyailhdsnalatrile 

edlkalrqrhrtvadkqnsmanmmgmpqtpsggledythdlteqai^gklepvigrdkeisrmiqilsrktknnpvlv 

GDAGVGKTALALGLAQRIASGDVPAEMAKMRVLELDLMNVVAGTRFRGDFEERMNNIIKDIEEIXjQVILFIDELHTIM 

gsgsgidstldaanilkpalargtlrtvgattqeeyqkhiekdaai^rrfakvtieepsvadsmtilqglkatyekhhrv
qrtoeavetavkmahryltsrhlpdsaidlldeaaatvqnkakhvk^ 

pvykdlvtesdilttlsrlsgipvqkltqtdakkylnleaelhkrvigqdqavssisrairrnqsgirshkrpigsfmflgp 

tgvgktelakalaevlfddesalirfdmseymekfaasrlngappgyvgyeeggeltekvrnkpysvllfdevekahp 

dif^llqvlddgvltdskgrkvdfsntiiimtsnlgatalrddktvgfgakdirfdqenmekrmfeelkkayrpern 

RIdekvvfhslssdhmqevvkimvkplvasltekgidlkloasalkllanqgydpemgarplrrtlqtevedklaell 

lkgdlvagstlkigvkagqlkfdiaz 

ID40 1008bp

atgaagaaaacatggaaagtgtttttaacgcttgtaacagctcttgtagctgttgtgcttg

gaactgcttct aaag acaacaaagaggcagaacttaagaaggttgactttatcctagactggacaccaaatacca 
accacacagggctttatgttgccaaggaaaaaggttatttcaaagaagctggagtggatgttgatttgaaattgc 
caccagaagaaagttcttctgacttggttatcaacggaaaggcaccatttgcagtgtatttccaagactacatggc 
taagaaattggaaaaaggagcaggaatcactgccgttgcagctattgttgaacacaatacatcaggaatcatctc 

tcgtaaatctgataatgtaagcagtccaaaagacttggttggtaagaaatatgggacatggaatgacccaactga
acttgctatgttgaaaaccttggtagaatctcaaggtggagactttgagaaggttgaaaaagtaccaaataacga 
ctcaaactcaatcacaccgattgccaatggcgtctttgatactc 

gctaaatctcaaggtgtagatgctaacttcatgtacttgaaagactatgtcaaggagtttgactactattcaccag 
t^atcatcgcaaacaacgactatctgaaagataacaaagaagaagctcgcaaagtcatccaagccatcaaaaaa 
ggctaccaatatgccatggaacatccagaagaagctgcagatattctcatcaagaatgcacctgaactcaaggaa
aaacgtgactttgtcatcgaat ctca aaaatacttgtcaaaagaatacgcaagcgacaaggaaaaatggggtcaa 

tttgacgcagctcgctggaatgctttctacaaatgggataaagaaaatggtatccttaaagaagacttgacagac 
aaaggcttcaccaacgaatttgtgaaataa 

mkktwkvfltlvtalvavvlvacgqgtaskdnkeaelkkvdfildwtpntnhtglyvakekgyfkeagvdvdlklp

peesssdlvingkapfavyfqdymakklekgagitavaaivehntsgiisrksdnvsspkdlvgkkygtwndptelaml 

ktlvesqggdfekvekvpnndsnsitpiangvfdtawiyygwdgilaksqgvdanfmylkdyvkefdyyspviiannd 

ylkdnkeearkviqaikkgyqyamehpeeaadiliknapelkekrdfviesqkylskeyasdkekwgofdaarwnafy 
kwdkengilkedltdkgftnefvkz 






ID41 762bp



ttgatgagaa actt gagaagtatactgagacgacacattagtctattc 

AGTTAGCAGGTTTTCITAAACTTCTCCCCAAGTTTATCCTGCCGACACCTCnTGAAATTCT 
GACAGAGAATTTCTCTGGCACCATAGCTGGGCGACCTTGAGAGTGGCTTTACTGGGGCTGATTTTGGGAGTTTTG^
TTGCCTGTCTTATGGCTGTGCTCATGGATAGTTTGACTTGGCT

CAGACCATTCCGACCATTGCCATAGCTCCTATCCTGGTCTTGTGGCTAGGTTATGGGATTTTGCCCAAGATTGTCT 
TGATTATCTTAACGACAACCTTTCCCATCATCGTTAGTATTTTC 

GACCTTGTTTAGTCTGATGCGGGCCAAGCCTTGGCAAATCCTGTGGCATTTTAAAATCCCAGTTAGCCT 
TTTTATGCAGGTCTGAGGGTCAGTGTCTCCTACGCCTTTATCACAACTGTGGTATCT^

AAGGTCTTGGTGTTTATATGATTCAGTCTAAAAAACTGTTTCAGTATGATACCATGTTTGCCATTATTATTCT 
TCGATTATCAGTCTTTTGGGTATGAAGCTGGTCGATATCAGTGAAAAATATGTGATTAAATGGAAACGTTCGTAG 

MMRNIJ^ILRRHISLLGFLGVL^IWQLAGFLKLLPKFILPTPLEILQP 
AVLMDSLTWLNDUYPMMVVIQTIPTIAIAPILVLWLGYGILPKW^

WQILWHFXIPVSLPYFYAGLRVSVSYAFTTTVVSEWLGGreGLGVYMIOSKKlJQYDTMFAIIILVSIISLLGMKLVDISE 
KYVIKWKRSZ 

ID42 372bp 


TTGATTTTTAATCCTATTTGCTGTATGATAAGGGAAAAGAAAGGGGACAGAGATATGGCI'I'rrACCAATACCCACA 
TGCGATCTGCTAGTTTTGGTATTGTTACCAGCTTGCCTGATGACATCATTGACTCTTTTTGGTA 
TTCTTAAAAAATGTCTrrGAATTGGAAGAAGAACTCGAGTTTCAATTGCTTAATAACCA 
ACrTTTCAAGTCAACACCTCCCTACAGCCATTGATTTTGACTTTAACCATCCTTTCGACC 
GTACTGGTTTTAGACATGGACGGTAGAGAAACTATCCTCCTCCCAGAAGAAAATGACCTATTTTAA

MIFNP1CCMIREKKGDRDMAFTOTHMRSASFGIVTSLPDDIIDSFWYIIDHFLKNVFELEEELEFQLLNNOGKITFHFSSQ 
HLPTAIDFDFNHPFDPRYPPRVLVLDMDGRETILLPEENDLFZ 

ID43 1569bp

ACAGCGGT GTCA TTCTATCTATTTTAAGAAAAGTAATAATCAATTGTr 
ATGAAATATTTTGTTCCTAATGAGGTATTCAGTATTCGTAAATTAAAGGTG 

TTTCAATTTTGGGAAGCCAAGGTATTTTATCGGATGAAGTTGTTACTAGTTCTTCACCGATGGCTACAAAAGAGTC 
TTCTAATGCAATTACTAATGATTTAGATAATTCACCAACTGTTAATCAGAATCGTTCTGCTGAAATGATTGCCT
AATTCAACCACTAATGGTTTAGATAATTCGTTAAGTGTTAATAGCATCAGCTCTAATGGTACTATTCGT^ 
CACAATTAGACAACAGAACAGTTGAATCTACAGTAACATCTACTAATGAAAATAAGAGTTATAAGGAAGATGTTA 
TAAGTGACAGAATTATCAAAAAAGAATTTGAAGATACTGCnTTAAGTGTAAAAGATTATGGTGCAGTAGGTGATG 
G GATT CATGATGATCGACAAGCAATTCAAGATGCAATAGATGCTGCAGCTCAAGGGCTAGGTGGAGGAAATGTAT 
ATTTTCCTG A AG G A ACTT ATTT AGT A A A AG A AATTG 1"! ' I"l ' 1 " 1 ' 1 ' A A A AAGTC AT AC A C A CTT A G A ATTG A ATG AG AA

AG CT AC AATTCT AAATGGT AT AA AT ATT A AG AATC ACC CTTC C ATTGTTTTT ATG AC AG GTTT ATTT ACGG ATG AT 
GGTGCGCAAGTAGAATGGGGCCCAACAGAAGATATTAGTTATTCTGGTGGTACGATTGATATGAACGGTGCTTTG 
AATGAAGAAGGAACTAAAGCAAAAAATCTACCACTTATAAATTCTTCAGGTGCA^^ 

AACGT A ACT AT AAA A A ATGTAAC ATT C AAGG AT AGTT ATC AAG GG C ATG CT ATTC AA ATTG C AGGTTCG A A AAAT 
GT ATT AGTTGATAATTCTCGTTTTCTTGGGCAAGCCTTACCCAAAACGATGAAGGATGGGCAAATCATAAGT AAGG

AG AG C ATTC AG ATTG A A CC AT T A ACT AG AA A AG GTTTTCCTT ATG CCTTG A ATG ATG ATG G G AAA AA ATCTG AAA 
ATGTGACTATTCAAAATTCCTATTTTGGCAAAAGTGATAAATCTGGGGAATTAGTAACAGCAATTGGCACACACTA 
TC AA A C ATT GTCG AC A C A G A A CCC CTCT A A T ATT A A A ATTC A AAAT AATC ATTTTG AT AA C ATG ATGT ATG C AG GT 
GTACGTTTTACAGGATTCACTGATGTATTAATCAAAGGAAATCGCTTTGATAAGAAAGTTAAAGGAGAGAGTGTA 
CATTATCGAGAAAGCGGAGCAGCTTTAGTAAATGCTTATAGCTATAAAAACACTAAAGACCTATTAGATTTAAAT
AAACAGGTGGTTATCGCCGA AAAT AT ATTT AATATTGCCGATCCTAAAACAAAAGCGATACGAGTTGCAAAAGAT 
AGTGCAGAATGTTTAGGAAAAGTATCAGATATTACTGTAACAAAAAATGTAATTAATAATAATTCTAAGGAAACA 
GAACAACCAAATATTGAATTATTACGAGTTAGTGATAATTTAGTAGTCTCAGAGAATAGT 

QRCHSIYFKKSNNQLLKIVKKLEVLMKYFVPNEVFSIRKLKVGTCSVLLAISILGSQGILSDEVVTSSSPMATKESSNAITN
DLDNSPTVNQNRSAEM1ASNSTTNGLDNSLSVNSISSNGTIRSNSQLDNRTVESTVTSTNENKSYKEDVISDRIIKKEFEDT 
ALSVKDYGAVGDGIHDDRQAIODAIDAAAQGLGGGNVYFPEGTYLVKEIVFLKSHTHLELNEKATILNGINIKNHPSIVF 
MTGLFTDDGAQVEWGPTEDISYSGGTIDMNGALNEEGTKAKNLPLINSSGAFAIGNSNNVTIKNVTFKDSYQGHAIQIA 
GSKNVLVDNSRFLGQALPKTMKDGQUSKESIOIEPLTRKGFPYALNDrX3KKSENVTIONSYFGKSDKSGELVTAIGTHY 

QTLSTQNPSNIKIQNNHFDNMMYAGVRFTGFTDVLIKGNRFDKKVKGESVHYRESGAALVNAYSYKNTKDLLDLNKQ
VVIAENIFNIADPKTKAIRVAKDSAECLGKVSDITVTKNVINNNSKETEQPNIELLRVSDNLVVSENS 

ID44 324bp 

GTGATGAAAGAAACTCAGCTATTAAAAGGTGTTCTTGAAGGTTGTGTCTTGGATATGATTGGTCAAAAAGAGCGG
TATGGTTATGAGTTGGTTCAGACTTTGCGAGAGGCTGGATTTGATACTATCGTTCCAGGAACTATTTAT^ 
GCAAAAGTTAGAAAAAAATCAATGGATAAGAGGCGACATGCGCCCGTCGCCAGATGGTCCAGATCGGAAGTATTT 
TTCATTAATGAAAGAAGGAGAAGAGCGTGTCTCAGTCTTTTGGCAACAATGGGACGATTTGAGTCAAAAAGTAGA 
AGGGATTAAGAATGGGGGTTAA 






MMKETQLLKGVLEGCVLDMIGOKERYGYELVQTLREAGFDTIVPGTIYPLLQKLEKNQWIRGDMRPSPDGPDRKYFSL 
MKEGEERVSVFWQQWDDLSQKVEGIKNGGZ 

ID45 816bp 

ATGAAGAAAATGAAGTATT^GAAGAAACAAGCGCTTTGCTACATGAGTTTTCTGAGGAGAATC 
GAGGAGTTGTGGGAAAGTrTTAATCTTGCrGGATTTCTCTATGATGAAG 

TGATGCTAGATTTCTCAGAAGCAGAACGAGATGGCATGAGTGCAGAGGATTATCTAGGTAAGAATCCTAAAAAAA 
TAATGAAAGAGATTCTCAAGGGAGCACCTCGCAGTTCTATCAAAGAGTCCCTTTTGACGCCAATTCTTGTCCTGGC 
GGTATTACGTTATTATCAACTACTAAGTGATTTTTCTAAAGGTCCT

GGCAACTTCTTA 111 11 CTGATTGGATTTGGACTTGTGGCCACAATTTTACGAAGAAGTTTAGTCCAAGATTCTCCT 

AAAATGAAAATTGGCACTT ACATT GTTGTTGGGACTATAGTTCTTCTAGTTGTTTTAGGATATCT 

GCTTCATACAAGAAGGAGCCTTTTATATTCCGGCTCCCTGGGATAGTTTGTCT 

GGTATTTGGAATTGGAAAGAAGCGGTCTTTCGTCCATTTGTCAGTATGATTATTGCCCATCTTGTGGTGGGTTCTCT 
GCTCCGTTATTATGAGTGGATGGGAATTTCAAATGTTTTCCTTACAAAAGTT

GAATCTTTGTCTTGTTCCGTGGGTTTAAGAAGATAAAATGGAGTGAAGTATAG 

MKKMKYYEETSALLHEFSEENQKYFEELWESFNLAGFLYDEDYLREQIYLMMLDFSEAERDGMSAEDYLGKNPKKIM 
KEILKGAPRSSIKESLLTPILVLAVLRYYQLLSDFSKGPLLTVNLLTFLGQLLIFLIGFGLVATILRRSLVQDSPKMKIGTYI 
VVGTIVLLVVLGYVGMASFIQEGAFYIPAPWDSl^VFTISLVIGIWNWCEAVFRPFVSMIIAHLWGSLLRYYEWMGISN
VFLTKVIPLAVLFIGIFVLFRGFKKIKWSEVZ 

ID46 348bp 

CTGl i 1 l i J 1 ATTTATACTCAATGAAAATCAAAGAGCAAACTAGGAAGCTAGCCGCAGGTTGCTCAAAACACTGTT

TTGAGGTTGTAGA CGAA ACTGACGAAGTCAGCTCAAAACATGTTTTTGAGGTTGTAGATGAAACTGACGAAGTCA 
GCTC AAAAC ACTGTTTTG AGG1 TGT AG ATG AA ACTG ACG AAGTC AGCTC AA AACACTGTTTTG AGGTTGT AG ATG 
AAACTGACGAAGTCAGCTCAAAACATGTTTTTGAGGTTGTAGATGAAACTGACGAAGTCAGTAACCATACATACG 
GTAGGGCGACGCTGACGTGGTTTGAAGAGATTTTCGAAGAGTATTAA 






MFFYLYSMKIKEQTRKLAAGCSKHCFEVVDETDEVSSKHVFEVVDETDEVSSKHCFEVVDETDEVSSKHCFEVVDETD 
EVSSKHVFEVVDETDEVSNHTYGRATLTWFEEIFEEYZ 

ID47 1260bp 



ATGCAGAATCTGAAATTTGCCTTTTCATCTATCATGGCTCACAAGATGCGTTCTTTGCTTACT 

TATCGGTGTTTCATCAGTTGTT GTGATTA TGGCTTTGGGTGATTCCCTATCTCGTCAAGTCAATAAAGATATGACTA 
AATCTCAGAAAAATATTAGCGTCTTTTTCTCTCCTA^ 

TTTTACGGTTTCTGGAAAGGAAGAGGAAGTTCCTGTTGAACCGCCAAAACCGCAAGAATCCTGGGTCCAAGAGGC 
AGCTAAACTGAAGGGAGTGGATAGTTACTATGTAACCAATTCAACGAATGCCATCTTGACCTATCAAGATAAAAA
GGTTGAGAATGCTAATTTGACAGGTGGAAACAGAACTTACATGGACGCTGTTAAGAATGAAATTATTGCAGGTCG 
TAGTCTGAGAGAGCAAGATTTCAAAGAGTTTGCAAGTGTCATTTTGCTAGATGAGG 

GAATCTCCTCAAGAGGCTATTAACAAGGTTGTAGAAGTCAATGGATTTAGTTACCGGGTCATTGGGGTTTATACTA 
GTCCG GAG G CT AA A AG ATC A A AA AT AT ATG GGTTTG GTG G CTTG C CT ATT A CT A CC A AT A TCTCCCTTG CTG CG A A 
TTT^AATGTAGATGAAATAGCTAATATTGTCTTTCGAGTGAATGATACCAGTTTAACCCCAACTCTGGGTCCAGAA
CTGGCACGAAAAATGACAGAGC TTGCA GGCTTACAACAGGGAGAATACCAGGTGGCAGATGAGTCCGTTGTATTT 
GCAGAAATTCAACAATCGTTTAGTTTTATGACGACGATTATTAGTTCCATC^ 

GAACTGGTGTCATGAAC ATCA TGCTGG TITCGG TGACAGAGCGCACTCGTGAGATTGGTCTTCGTAAGGCTTTGGG 
TGCAACACGTGCCAATATTTTAATTCAGTTTTTGATTGAATCCATGATTTTGACCTTGTTAGGTGGCTTAATTGGCT 
TGACAATTGCAAGTGGTTTAACTGCCTTAGCAGGTTTGTTACTGCAAGGTTTAATAGAAGGTATAGAAGTTGGAGT
ATCAATCCCAGTCGCCCTATTTAGTCTTGCAGTTTCGGCTAGTGTTGGTATGATTITTGGAGTCTTGCCAGCCAAC 
AAGGCATCGAAACTTGATCCAATTGAAGCCCTTCGTTATGAATGA 

MQNLKFAFSSIMAHKMRSLLTMIGIIIGVSSVVVIMALGDSL^ROVNKDMTKSQKNISVFFSPKKSKDGSFTQKQSAFTVS 
GKEEEVPVEPPKPQESWVQEAAKLKGVDSYYVTNSTNAILTYQDKKVENANLTGGNRTYMDAVKNEIIAGRSLREQDF
KEFASVILLDEELSISLFESPQEAINKVVEVNGFSYRVIGVYTSPEAKRSKIYGFGGLPITTNISLAANFNVDEIANIVFRVN 
DTSLTPTLGPEL^RKMTELAGLOOGEYOVADESVVFAEIQOSFSFMTTIISSIAGISLFVGGTGVMNIMLVSVTERTREIG 
LRKALGATRANILIOFLIESM1LTLLGGLIGLTIASGLTALAGLLLQGUEGIEVGVSIPVALFSLAVSASVGMIFGVLPANK 
ASKLDPIEALRYEZ 



ID48 705bp 



CTGATGAAGCAACTAATTAGTCTAAAAAATATCTTCAGAAGTTACCGTAATGGTGACCAAGAACTGCAGGTTCTC 
AAAAATATCAATCTAGAAGTGAATGAGGGTGAATTTGTAGCCATCATGGGACCATCTGGGTCTGGTAAGTCCACT 
CTGATGAATACGATTGGCATGTTGGATACACCAACCAGTGGAGAATATTATCTTGAAGGTCAAGAAGTGGCTGGG






CTTGGTGAAAAACAACTAGCrAAGGTCCGTAACCAACAAATCGGTTTTGTCTTTCAGCAGTT C ' l 1 I Cll CTATCGA 
AGCTCAATGCTCTGCAAAATGTAGAATTGCCCTTGATTTACGCAGG 

TGAGGAATATTTAGACAAGGTTGAATTGACAGAACGTAGTCACCATTTACCTTCAGAATTATCTGGTGGTCAAAA 
GCAACGTGTAGCCATTGCGCGTGCCTTGGTAAACAATCCTTCTATTATCCTAGCGGATGAACCGACAGGAGCCTTG 
GATACCAAAACAGGTAACCAAATTATGCAATTATTGGTTGATTTGAATAAAGAAGGAAAAACCATTATCATGGTA
ACGCATGAGCCTGAGATTGCTGCCTATGCCAAACGTCAGATTGTCATTCGGGATGGGGTCATTTCGTCTGACAGTG 
CTCAGTTAGGAAAGGAGGAAAACTAA 



MMKOUSLKNIFRSYRNGDQELQVLKNINLEVNEGEFVAIMGPSGSGKSTLMNTIGMLDTPTSGEYYLEGQEVAGLGEK 
QLAKVRNQQIGFVFQQFFLLSKLNALQNVELPLIYAGVSSSKRRKLAEEYLDKVELTERSHHLPSELSGGQKQRVAIARA
LVNNPSIILADEPTGALDTKTGNQIMQLLVDLNKEGKTIIMVTHEPEIAAYAKROIVIRDGVISSDSAQLGICEENZ 

ID49 1200bp 

ATGAAGAAAAAGAATGGTAAAGCTAAAAAGTGGCAACTGTATGCAGCAATCGGTGCTGCGAGTGTAGTTGTATTG
GGTGCTGGGGGGATTTTACTCTTTAGACAACCTTCTCAGACTGCTCTAAAAGATGAGCCTACT 
CCAAGGAAGGAAGCGTGGCCTCCTCTGTTTTATTGTCAGGGACAGTAACAGCAAAAAATGAACAATATGTTTATT 
TTGATGCTAGTAAGGGTGATTTAGATGAAATCCTTGTTTCTGTGGGCGATAAGGTCAGCGAAGGGCAGGCTTTAGT
CAAGTACAGTAGTTCAGAAGCGCAGGCGGCCTATGATTCAGCTAGTCGAGCAGTAGCTAGGGCAGATCGTCATAT

CAATGAACTCAATCAAGCACGAAATGAAGCCGCTTCAGCTCCGGCTCCACAGTTACCAGCGCCAGTAGGAGGAGA
AGATGCAACGGTGCAAAGCCCAACTCCAGTGGCTGGAAATTCTGTTGCTTCTATTGACGCTCAATTGGGTGATGCC 
CGTGATGCGCGTGCAGATGCTGCGGCGCAATTAAGCAAGGCTCAAAGTCAATTGGATGCAACAACTGTTCTCAGT 
ACCCTAGAGGGAACTGTGGTCGAAGTCAATAGCAATGTTTCTAAATCTCCAACAGGGGCGAGTCAAGTTATGGTT 
CATATTGTCAGCAATGAAAATTTACAAGTCAAGGGAGAATTGTCTGAGTACAATCTAGCCAACCTTTCTGTAGGTC 

AAGAAGTAAGCTTTACirCTAAAGTGTATCCTGATAAAAAATGGACTGGGAAATTAAG

TAAAAACAATGGTGAAGCAGCTAGTCCAGCAGCCGGGAATAATACAGGTTCTAAATACCCTTATACTATTGATGT 
GACAGGCGAGGTTGGTGATTTGAAACAAGGTTTTTCTGTCAACATTGAGGTTAAAAGCAAAACT^ 
GTTCCTGTTAGCAGTCTAGTAATGGATGATAGTAAAAATTATGTCTGGATTGTGGATGAACAACAAAAGGCTAAA 
AAAGTTGAGGTTTCATTGGGAAATGCTGACGCAGAAAATCAAGAAATCACTTCTGGTTTAACG 

GTCATCAGTAATCCAACATCTTCCTTGGAAGAAGGAAAAGAGGTGAAGGCTGATGAAGCAACTAATTAG

MKKKNGKAKKWQLYAAIGAASVVVLGAGGILLFRQPSQTALKDEPTHLVVAKEGSVASSVLLSGTVTAKNEQYVYFD
ASKGDLDEILVSVGDKVSEGQALVKYSSSEAQAAYDSASRAVARADRHINELNQARNEAASAPAPQLPAPVGGEDATV 
QSPTPVAGNSVASIDAQLGDARDARADAAAQLSKAQSQLDATTVLSTLEGTVVEVNSNVSKSPTGASQVMVHIVSNEN 
LQVKGELSEYNLANLSVGQEVSFTSKVYPDKKWTGKLSYISDYPKNNGEAASPAAGNNTGSKYPYTIDVTGEVGDLKQ
GFSVNIEVKSKTKAILVPVSSLVMDDSKNYVWIVDEQQKAKKVEVSLGNADAENQEITSGLTNGAKVISNPTSSLEEGKE 
VKADEATNZ 

ID50 759bp


ATGTCACGTAAACCATTTATCGCTGGTAACTGGAAAATGAACAAAAATCCAGAAGAAGCTAAAGCATTCGTTGAA 
GCAGTTGCATCAAAACTTCCTTCATCAGATCTTGTTGAAGCAGGTATCGCT 

TTCTTGCTGTTGCAAAAGGCTCAAACCTTAAAGTTGCTGCTCAAAACTGCTACTTTGAAAATGC^
TGGTGAAACTAGCCCACAAGTTTTGAAAGAAATCGGTACTGACTACGTTGTTATCGGTCACTCAGAACGCCGTGA 

CTACTTCCATGAAACTGATGAAGATATCAACAAAAAAGCAAAAGCAATCTTTGCGAACGGTATGCTTCCAATCAT
CTGTTGTGGTGAATCACTTGAAACTTACGAAGCTGGTAAAGCTGCTGAATTCGTAGGTGCTCAAGTATCTGCTGCA 
TTGGCTGGATTGACTGCTGAACAAGTTGCTGCCTCAGTTATCGCTTATGAGCCAATCTGGGCTATCGGTACT
AATCAGCTTCACAAGACGATGCACAAAAAATGTGTAAAGTTGTTCGTGACGTTGTAGCTGCTGACTTTGGTCAAG 
AAGTCGCAGACAAAGTTCGTGTTCAATACGGTGGTTCTGTTAAACCTGAAAATGTTGCTTCATACATGGCTTGCCC 

AGACGTTGACGGTGCCCTTGTAGGTGGTGCGTCACTTGAAGCTGAAAGCTTCTTGGCTTTGCTTGACT
TAA 



MSRKPFIAGNWKMNKNPEEAKAFVEAVASKLPSSDLVEAGIAAPALDLTTVLAVAKGSNLKVAAQNCYFENAGAFTG 
ETSPQVLKEIGTDYVVIGHSERRDYFHETDEDINKKAKAIFANGMLPIICCGESLETYEAGKAAEFVGAQVSAALAGLTA
EQVAASVIAYEPIWAIGTGKSASQDDAQKMCKVVRDVVAADFGQEVADKVRVQYGGSVKPENVASYMACPDVDGAL 
VGGASLEAESFLALLDFVKZ 

ID51 1473bp

TTGAAAACAAAAATTGGATTAGCAAGTATCTGTTTACTAGGCTTGGCAACTAGTCATGTCGCTGCAAATGAAACT 
AAGTAGCAAAAACTTCGCAGGATACAACGACAGCTTCAAGTAGTTCAGAGCAAAATCAGTCTTCTAATAAAACGC 
AAACGAGCGCAGAAGTACAGACTAATGCTGCTGCCCACTGGGATGGGGATTATTATGTAAAGGATGATGGTTCTA
AAGCTCAAAGTGAATGGATTTTTGACAACTACTATAAGGCTTC 

GAATGAATGGCATGGAAATTACTACCTGAAATCAGGTGGATATATGGCCCAAAACGAGTGGATCTATGACAGTAA 
TTACAAGAGTTGGTTTTATCTCAAGTCAGATGGGGCTTATGCTCATCAAGAATGGCAATTGATTGGAAATAAGTGG
TACTACTTCAAGAAGTGGGGTTACATGGCTAAAAGCCAATGGCAAGGAAGTTATTTCTTGAATGGTCAAGGAGCT
ATGATGCAAAATGAATGGCTCTATGATCCAGCCTATTCTGCTTATTTTTATCTAAAATCCGATGGAACTTATGCT 
ACCAAGAGTGGCAAAAAGTGGGCGGCAAATGGTACTATTTCAAGAAGTGGGGCTATATGGCTCGGAATGAGTGGC 
AAGGCAACTACTATTTGACTGGAAGTGGTGCCATGGCGACTGACGAAGTGATTATGGATGGTACTCGCTATATCTT 
TGCGGCCTCTGGTGAGCTCAAAGAAAAAAAAGATTTGAATGTCGGCTGGGTTCACAGAGATGGTAAGCGCTATTT
CTTTAATAATAGAGAAGAACAAGTGGGAACCGAACATGCTAAGAAAGTCATTGATATTAGTGAGCACAATGGTCG 
TATCAATGATTGGAAAAAGGTTATTGATGAGAACGAAGTGGATGGTGTCATTGTTCGTCTAGGTTATAGCGGTAA 
AGAAGACAAGGAATTGGCGCATAACATTAAGGAGTTAAACCGTCTGGGAATTCCTTATGGTGTCTATCTCTATAC 
CTATGCTGAAAATGAGACCGTGCTGAGAGTGACGCTAAACAGACCATTGAACTTATAAAGAAATACAATATGAAC 

CTGTCTTACCCTATCTATTATGATGTTGAGAATTGGGAATATGTAAATAAGAGCAAGAGAGCTCCAAGTGATACA
GGCACTTGGGTTAAAATCATCAACAAGTACATGGACACGATGAAGCAGGCGGGTTATCAAAATGTGTATGTCTAT 
AGCTATCGTAGTTTATTACAGACGCGTTTAAAACACCCAGATATTTTAAAACATGTAAACTGGGTAGCGGCCTATA 
CGAATGCTTTAGAATGGGAAAACCCTCATTATTCAGGAAAAAAAGGTTGGCAATATACCTCTTCTGAATACATGA 

AAGGAATCCAAGGGCGCGTAGATGTCAGCGTTTGGTATTAA

MKTKIGLASICLLGLATSHVAANETEVAKTSQDTTTASSSSEQNQSSNKTQTSAEVQTNAAAHWDGDYYVKDDGSKAQ 
SEWIFDNYYKAWFYINSIXiRYSQNEWHGNYYLKSGGYMAQNEWIYDSNYKSWFYLKSDGAYAHQEWQLIGNKWYY
FKKWGYMAKSQWQGSYFLNGQGAMMQNEWLYDPAYSAYFYLKSDGTYANQEWQKVGGKWYYFKKWGYMARNE

WQGNYYLTGSGAMATDEVIMDGTRYIFAASGELKEKKDLNVGWVHRDGKRYFFNNREEQVGTEHAKKVIDISEHNG
RINDWKKVIDENEVDGVIVRLGYSGKEDKELAHNIKELNRLGIPYGVYLYTYAENETDAESDAKQTIELIKKYNMNLSY
PIYYDVENWEYVNKSKRAPSDTGTWVKIINKYMDTMKQAGYQNVYVYSYRSLLQTRLKHPDILKHVNWVAAYTNAL
EWENPHYSGKKGWQYTSSEYMKGIQGRVDVSVWYZ

ID52 774bp 


atgaaaaaatttgccaacctttatctgggactggtctttct 
tgcctttaatgctggt gatga tatgaatagctttacaggt^ 

gggag actcatgctgattttggctcagacau iuicimggccttcctatcagccttgatagcgaccattatcggga 
cl i i iggtgccatttacatctaccagtctcgtaagaaataccaagaagcctttctatcactcaataatatcctcat 

ggttgcgcctgacgttatgattggtgctagcttcttgattctctttacccaactcaagttttcact^
ccgttcratctagtcacgtggccttcrccattcctatcgtggtcttgatggtcttgcctcgact^ 
tggcgacatgattcatgcggcctatgacitgggagctagtcaatttcagatgttcaaggaaa 
ctgactccgtct atcatt actggttatttcatggccttcacctattcgttagatgactttgccgtgacc' i 
aacaggaaatggh l r rcaac c ctat cagtcgagatttactctcgtgctcgcaaggggatttccitagaaatcaat 

gccctgtctgctctagtctttctctttagtattatcctagttgtaggttattactttatctct
gcaagcatga 



MKKFANLYLGLVFLVLYLPIFYLIGYAFNAGDDMNSFTGFSWTHFETMFGDGRLMLILAQTFFLAFLSALIATIIGTFGA 
IYIYQSRKKYQEAFLSLNNILMVAPDVMIGASFLILFTQLKFSLGFLTVLSSHVAFSIPIVVLMVLPRLKEMNGDMIHAAY 
DLGASQFQMFKEIMLPYLTPSIITGYFMAFTYSLDDFAVTFFVTGNGFSTLSVEIYSRARKGISLEINALSALVFLFSIILVV
GYYFISREKEEQAZ 

ID59 1071bp

ATGAAAAAAATCTATTCATTTTTAGCAGGAATTGCAGCGATTATCCTTC^

ATAGTAAAATCAATAGTCGAGATAGTCAAAAATTGGTTATCTATAACTGGGGAGACTATATCGATCCTGAACTCTT 
GACTCAGTTTACAGAAGAAACAGGAATTCAAGTTCAGTACGAGACTTTTGACTCCAACGAAGCCATGTACACTAA 
GATAAAGCAGGGTGGAACGACCTACGATATTGCCATTCCAAGTGAATACATGATTAACAAGATGAAGGACGAAG 
ACCTCTTGGTTCCGCTTGATTATTCAAAAATTGAAGGAATCGAAAATATCGGACCAGAGTTTCTCAACCAGTCCTT 

TGACCCAGGTAATAAATTCTCCATCCCITACTTCTGGGGAACCTTAGGAATTGTCTACAACGAAACCATGGTAGAT
GAAGCGCCTGAGCATTGGGATGACCTTTGGAAGCCGGAGTATAAGAATTCTATCATGCTCTTTGATGGGGCGCGT 
GAGGTGCTGGGACTAGGACTCAATTCCCTCGGCTACAGCCTCAACTCCAAGGATCTGCAGCAGTTGGAAGAGACA 
GTGGATAAGCTCTACAAACTGACTCCAAATATCAAGGCTATCGTTGCGGACGAGATGAAGGGCTATATGATTCAG 
AATAATGTTGCAATCGGCGTGACCTTCTCTGGTGAAGCCAGCCAAATGTTAGAAAAAAATGAAAATCTACGTTAT

GTGGTACCGACAGAGGCCAGCAATCTTTGGTTTGACAATATGGTCATTCCCAAAACAGTTAAAAACCAAAACTCA

GCCTATGCCITTATCAACTTrATGTTGAAACCrGAAAATGCrCTCCAAAATGCGGAG 

CAAACCTACCAGCGAAGGAATTGCTCCCAGAGGAAACAAAGGAAGATAAGGCCTTCTATCCCGATGTTGAAACCA 

TGAAACACCTAGAAGTTTATGAGAAATTTGACCATAAATGGACAGGGAAATATAGCGACCTCTTCCTACAGTTTA 
AAATGTATCGGAAGTAG


MKKIYSFLAGIAAIILVLWGIATHLDSKINSRDSQKLVIYNWGDYIDPELLTQFTEETGIQVQYETFDSNEAMYTKIKQGG
TTYDIAIPSEYMINKMKDEDLLVPLDYSKIEGIENIGPEFLNQSFDPGNKFSIPYFWGTLGIVYNETMVDEAPEHWDDLW
KPEYKNSIMLFDGAREVLGLGLNSLGYSLNSKDLQQLEETVDKLYKLTPNIKAIVADEMKGYMIQNNVAIGVTFSGEAS 
QMLEKNENLRYVVPTEASNLWFDNMVIPKTVKNQNSAYAFINFMLKPENALQNAEYVGYSTPNLPAKELLPEETKED 
KAFYPDVETMKHLEVYEKFDHKWTGKYSDLFLQFKMYRKZ
ID61 1851bp

ATGAATAAAAAACTAACAGATTATGTGATTGATCTGGTGGAAATTTTAAATAAACAACAAAAGCAGGTTTTCTGG 
GGAATATTTGATATTTTCAGTATGGTGGTTTCCATCATTGTATCrTATATTTTATTTTATGGGCT
ACCTGTTGACTACATTATCTATACGAGTTTGGCCrrCCTGTTCTATCAATTGATGATTGGTTTT^ 
CGAGCATTAGTCGTTACAGCAAGATTACGGATTTCATGAAAATCll'IUMUGGTGTGACTGCTAGCAGTGTCTTGTC 
AT ATAG T ATCTGTTATGCC 1 '7 C"l "1 GCC ACTCTTCTCCATCCGTTTCATC ATTCTCTTTATCTTGTTG AGTACC l'l C IT 
GATTTTATTGCCACGGATTACTTGGCAGTTAATCTACTCCAGACGCAAAAAAGGTAGTGGTGATGGAGAACACCG 

TCGGACCTTCTTGATTGGTGCCGGTGATGGTGGGGCT C I ' l ' l ' l ATGGATAGTTACCAACATCCAACCAGTGAATTA

GAACTGGTCGGTATTTTGGATAAGGATTCTAAGAAAAAGGGTCAAAAACTTGGTGGTATTCCTGTTTTGG 
ATGACAATCTGCCTGAATTAGCCAAACGCCATCAAATCGAGCGTGTCATCGTTGCGATTCCGTCGCTGGATCCGTC 
AGAATATGAGCGTATCTTGCAGATGTGTAATAAGCTGGGTGTCAAATGTTACAAGATGCCTAAGGTTGAAACTGT 
TGTTCAGGGCCTTCACCAAGCAGGTACTGGCTTCCAAAAAATTGATATTACGGACCTTTTGGGTCGTCAGGAAATC

CGTCTTGACGAATCGCGTCTGGGTGCAGAACTGACAGGTAAGACCATCTTAGTCACAGGAGCTGGAGGTTCAATC
GGTTCTGAAATCTGTCGTCAAGTTAGTCGCTTCAATCCTGAACGCATTGTCTTGCTCGGTCATGGGGAAAACT
TCTACCTTGTTTATCATGAATTGATTCGTAAGTTCCAAGGGATTGATTATGTACCTGTGATTGCGGACATTCAAGA 
CTATGATCGTTTGTTGCAAGTCTTTGAGCAGTACAAACCTGCTATTGTTTATCATGCGGCAGCCCACAAGCATGTT 
CCTATGATGGAGCGCAATCCAAAAGAAGCCTTCAAAAACAATATCCGTGGAACTTACAATGTTGCTAAGGCTGTT 

GATGAAGCTAAAGTGTCrAAGATGGTTATGATTTCGACAGATAAGGCAGTCAATCCACCAAATGTTATGGGAGCA
ACCAAGCGCGTGGCGGAGTTGATTGTCACTGGCTTTAACCAACGTAGCCAATCAACCTACTGTGCAGTTCGTTrrG 
GGAATGTTCTTGGTAGCCGTGGTAGTGTCATTCCAGTCITTGAACGTCAGATTGCTGAAGGTGGGCCTGTAACGGT 
GACAGACTTCCGTATGACCCGTTACTTTATGACCATTCCAGAAGCTAGCCGTCTGGTTATCCATGCTGGTGCTTAT 
GCCAAAGATGGGGAAGTCTTTATCCTTGATATGGGCAAACCAGTCAAGATTTATGACTTGGCCAAGAAGATGGTG 

CTTCTAAGTGGCCACACTGAAAGTGAAATTCCAATCGTTGAAGTTGGAATCCGCCCAGGTGAAAAACTCTACGAA
GAACTCTTGGTATCAACCGAACTCGTTGATAATCAAGTTATGGATAAGATTTTCGTTGGTAAGGTTAATGTCATGC
CTTTAGAATCCATCAATCAAAAGATTGGAGAGTTCCGCACTCTCAGTGGAGATGAGTTGAAGCAAGCTATTATCG
CCTTTGCTAATCAAACAACCCACATTGAATAA 

MNKKLTDYVIDLVEILNKQQKQVFWGIFDFFSMVVSIIVSYILFYGLINP

SKrrDFMKIFFGVTASSVl^YSICYAFLPLFSIRFIILFILLSTFULLPRITWQLIYSRRKKGSGDGEHRRTFLIGAGDGGALF 
MDSYQHPTSELELVGILDKDSKKKGQKLGGIPVLGSYDNLPELAKRHQIERVIVAIPSLDPSEYERILQMCNKLGVKCYK
MPKVETVVQGLHQAGTGFQKIDITDLLGRQEIRLDESRLGAELTGKTILVTGAGGSIGSEICRQVSRFNPERIVLLGHGEN
SIYLVYHELIRKFQGIDYVPVIADIQDYDRLLQVFEQYKPAIVYHAAAHKHVPMMERNPKEAFKNNIRGTYNVAKAVD 

EAKVSKMVMISTDKAVNPPNVMGATKRVAELIVTGFNQRSQSTYCAVRFGNVLGSRGSVIPVFERQIAEGGPVTVTDFR
MTRYFMTIPEASRLVIHAGAYAKDGEVFILDMGKPVKIYDLAKKMVLLSGHTESEIPIVEVGIRPGEKLYEELLVSTELV 
DNQVMDKIFVGKVNVMPLESINQKIGEFRTLSGDELKQAIIAFANQTTHIEZ

ID101 1338bp


ATGATTGAACTTTATGATAGTTACAGTCAAGAAAGTCGAGATTTACATGAAAGTCTAGTCGCTACTGGTCTTTCTC 

AACTTGGAGTGGTCATCGATGCAGATGGTTTTCTGCCTGATGGTCTGCTTTCTCCTTTTACCT

GAGGATGGAAAACCTCTCTATTTTAATCAAGTTCCCGTTTCAGATTTTTGGGAAATTTTAGGAGATAAT^

CTTGTATTGAAGATGTGACGCAGGAGAGGGCTGTCATTCATTATGCTGATGGAATGCAGGCTCGCTTGGTTAAACA 

GGTAGACTGGAAAGACCTAGAAGGTCGAGTACGTCAGGTTGACCACTACAATCGCTTCGGAGCTTGTTTTGCTAC
AACGACTTATAGCGCAGATAGCGAGCCGATTATGACAGTTTACCAAGATGTCAATGGTCAACAAGTTTTACTGGA 
AAACCATGTGACGGGTGATATCTTATTGACTTTGCCAGGTCAGTCCATGCGTTACTTTGCAAATAAAGTTGAATTT
ATCACC1 lCriM-lUGCAAGATTTGGAAATAGATACCAGTCAGCTTATCTTTAATACTCTAGCGACT 
TTCCTTCCATCATCCAGATAAATCTGGCTCGGATGTCTTGGTATGGCAGGAACCTCTCTATGATGCCATTCCAGGT

AATATGCAGTTGATTTTGGAAAGTGATAATGTGCGTACTAAGAAGATCATCATTCCAAATAAGGCGACTTATGAG
CGCGCTTTAGAGTTAACTGACGAGAAATACCATGATCAGTTTGTGCACTTGGGTTATCATTACCAGTTCAAACGTG 
ATAATTTCCTAAGACGAGATGCCTTAATCTTGACCAATTCAGATCAGATTGAGCAAGTAGAAGCAATCGCAGGAG
CCTTGCCTGATOTCACTTTCCGTATTGCAGCGGTGACAGAGATGTCnTCTAAGCT 

AATGTGGCCCTTTACCAGAACGCTAGTCCACAGAAGATTCAGGAGCTGTATCAACTGTCGGATATTTACTTGGATA 
TAAACCACAGTAATGAGTTGCTACAGGCAGTGCGTCAGGCCTTTGAGCACAATCTCTTGATTCTTGGCTTT

GACGGTGCACA ATAGA CTrTTATATCGCTCCAGACCATCTATTTGAAAGTAGTGAAGTTGCTGCriTGGTTGAGACC 
ATTAAATTGGCCCTTTCAGATGTTGATCAAATGCGTCAGGCACTTGGCAAACAAGGCCAACATGCAAATTATGTTG 
ACTTGGTGAGATATCAGGAAACCATGCAAACTGTTTTAGGAGGCTAA 


MIELYDSYSQESRDLHESLVATGLSQLGVVIDADGFLPDGLLSPFTYYLGYEDGKPLYFNQVPVSDFWEILGDNQSACIE
DVTQERAVIHYADGMQARLVKQVDWKDLEGRVRQVDHYNRFGACFATTTYSADSEPIMTVYQDVNGQQVLLENHV
TGDILLTLPGQSMRYFANKVEFITFFLQDLEIDTSQLIFNTLATPFLVSFHHPDKSGSDVLVWQEPLYDAIPGNMQLILES
DNVRTKKIIIPNKATYERALELTDEKYHDQFVHLGYHYQFKRDNFLRRDALILTNSDQIEQVEAIAGALPDVTFRIAAVT 
EMSSKLLDMLCYPNVALYQNASPQKIQELYQLSDIYLDINHSNELLQAVRQAFEHNLLILGFNQTVHNRLYIAPDHLFE
SSEVAALVETIKLALSDVDQMRQALGKQGQHANYVDLVRYQETMQTVLGGZ
ID102 1512bp

ATGACAATTTACAATATAAATTTAGGAATTGGTTGGGCTAGTAGCGGTGTTGAATACGCTCAAGCCTATCGTGCTG 
GTGTTTTTCGGAAATTAAATCTGTCCTCTAAGTTTATCTTTACAGATATGATTTTAGCCGATAATATTCAGCACTTA

ACAGCCAATATTGGTTTTGATGATAATCAGGTTATCTGGCTTTATAATCATTTCACAGATATCAAAATTGCACCTA 
CTAGCGTGACAGTGGATGATGTCTTGGCTTACTTTGGTGGTGAAGAAAGTCACAGAGAAAAAAATGGCAAGGTTT
* TACGTGTATTCri^"I44GACCAAGATAAGTTTGTAACCTGTTATTTGGTTGATGAGAACAAGGACTTGGTTCAACA 
TGCCGAGTATGTTTTrAAGGGAAACCTGATTCGGAAGGATTACTTTTCTTATACGCGrrATTG 
GCTCCCAAGGACAATGTTGCAGTCTTATACCAACGAACTTTTTATAATGAAGACGGGACTCCAGTCTATGATATCT

TGATGAATCAAGGGAAGGAAGAAGTTTATCATTTCAAGGATAAGATTTTCTATGGAAAGCAAG Cri T1 GTGCGTG 
CCTTTATGAAATCTTTGAATTTGAATAAGTCTGATTTC 

GTTTGAGGAAGCACAGACAGCACATCTAGCGGTAGTTGTTCATGCGGAGCATTATAGTGAAAATGCTACAAATGA 
GGACTATATCCT^GGAATAACTATTATGACTATCAGTITACCAATGCAGATAAGGTTGACriCiriATCGTGTCT 

ACTGATAGACAAAATGAAGTrCTACAAGAGCAATTTGCCAAATATACTCAGCATCAGCCAAAGATTGTTACCATT
CCTGTAGGCAGTATTGATTCCTTGACAGATTCAAGTCAAGGGCGCAAACCATTTTCATTGATTACGGCTTCACGTC 
TTGCCAAAGAAAAGCACATTGATTGGCTrGTGAAAGCTGTGATTGAAGCTCATAAGGAGTTACCGGAACTAACCT 
TTGATATCTATGGTAGTGGTGGAGAAGATTCTCTGCTTAGAGAAATTATTGCAAATCATCAGGCAGAGGACTATAT 
CCAACTCAAGGGGCATGCGGAACTTTCGCAGATTTATAGCCAGTATGAGGTCTACTTAACGGCTTCTACCAGCGA 

AGGATTTGGTCTGACCTTGATGGAAGCTATTGGTTCAGGTCTACCTCTAATTGG

AGACCTTTATAGAGGATGGGCAAAATGGTTATTTGATTCCAAGTTCATCTGACCATGTAGAAGACCAAATCAAGC 
AAGCTTATGCCGCTAAGATTTGTCAATTGTATCAAGAAAATCGTTTGGAAGCTATGCGTGCCTATTCTTACCAAAT 
TGCAGAAGGCTTCTTGACCAAAGAAATTTTAGAAAAGTGGAAGAAAACAGTAGAGGAGGTGCTCCATGATTGA 

MTIYNINLGIGWASSGVEYAQAYRAGVFRKLNLSSKFIFTDMILADNIQHLTANIGFDDNQVIWLYNHFTDIKIAPTSVT
VDDVLAYFGGEESHREKNGKVLRVFFFDQDKFVTCYLVDENKDLVQHAEYVFKGNLIRKDYFSYTRYCSEYFAPKDN 
VAVLYQRTFYNEDGTPVYDILMNQGKEEVYHFKDKIFYGKQAFVRAFMKSLNLNKSDLVILDRETGIGQVVFEEAQTA 
HLAVVVHAEHYSENATNEDYILWNNYYDYQFTNADKVDFFIVSTDRQNEVLQEQFAKYTQHQPKIVTIPVGSIDSLTDS 
SQGRKPFSLITASRLAKEKHIDWLVKAVIEAHKELPELTFDIYGSGGEDSLLREIIANHQAEDYIQLKGHAELSQIYSQYE

VYLTASTSEGFGLTLMEAIGSGLPLIGFDVPYGNQTFIEDGQNGYLIPSSSDHVEDQIKQAYAAKICQLYQENRLEAMRA
YSYQIAEGFLTKEILEKWKKTVEEVLHDZ




ID103 2292bp 

ATGTCCTCTCTTTCGGATCAAGAATTAGTAGCrAAAACAGTAGAGTTTCGTCAGC^ 
TAGACGATATTTTGGTTGAAGCTTTTGCTGTGGTGCGTGAAGCAGATAAGCGGATTTTAGGGATGTTTCCTTATC
TGTTCAAGTCATGGGAGCTATTGTCATGCACTATGGAAATGTTGCTGAGATGAATACGGGGGAAGGTAAGACCTT 
GACAGCTACCATGCCTGTCTATTTGAACGCTTTTTCAGGAGAAGGAGTGATC 

TCAAAGCGTGATGCCGAGGAAATGGGTCAAGTTTATCGTTTTCTAGGATTGACCATTGGTGTACCATTTACGGAAG 
ATCCAAAGAAGGAGATGAAAGCTGAAGAAAAGAAGCTTATCTATGCTTCGGATATCATCTACACAACCAATAGTA

ATTTAGGTTTTGATTATCTAAATGATAACCTAGCCTCGAATGA

GATTATTGATGAAATTGATGATATCTTGCTTGATAGTGCACAAACTCCTCTGArrATTGCGGGTTCT 

AGTCTAATTACTATGCGATCATTGATACACTTGTAACAACCTTGGTCGAAGGAGAGGATTATATCTTTAAAGAGGA

GAAAGAGGAGGTTT GGCTC ACrACTAAGGGGGCCAAGTCrGCrGAGAATTTCCTAGGGATTGATAATTTATACAA 

GGAAGAGCATGCGTCTTTTGCTCGTCATrrGGTTTATGCGATTCGAGCTCATAAGCTCTTTACT 

TATATCATTCGTGGAAATGAGATGGTACTGGTTGATAAGGGAACAGGGCGTCTAATGGAAATGACTAAACTTCAA
GGAGGTCTCCATCAGGCTATTGAAGCCAAGGAACATGTCAAATTATCTCCTGAGACGCGGGCTATGGCCTCGATC
ACCTATCAGAGTCTTTTTAAGATGTTTAATAAGATATCTGGTATGACAGGGACAGGTAAGGTCGCGGAAAAAGAG 
TTTATTGAAACr rACA ATATGTCTGTAGTACGCATTCCAACCAATCGTCCGAGACAACGGATTGACTATCCAGATA 
ATCTATATATCACTTTACCTGAAAAAGTGTATGCATCCTTGGAGTACATCAAGCAATACCATGCTAAGGGAAATCC

TTTACTCGTTTTTGTAGGCTCAGTTGAAATGTCTCAACTCTATTCGTCT

ATGTCCTAAATGCTAATAATGCGGCGCGTGAGGCTCAGATTATCTCCGAGTCAGGTCAGATGGGGGCTGTGACAG 
TGGCTACCrCTATGGCAGGACGTGGTACGGATATCAAGCTTGGTAAAGGAGTCGCAGAGCTTGGGGGCTTGATTG 
TTATTGGGACTGAGCGGATGGAAAGTCAGCGGATCGACCTACAAATTCGTGGCCGTTCTGGTCGTCAGGGAGATC
CTGGTATGAGTAAATTTTTTGTATCCTTAGAGGATGATGTrATCAAGAAATTTGGTCCATCrTGG 

GTACAAAGACTATCAGGTTCAAGATATGACTCAACCGGAAGTATTGAAAGGTCGTAAATACCGGAAACTAGTCGA
AAAGGCTCAGCATGCCAGTGATAGTGCTGGACGTTCAGCACGTCGTCAGACTCTGGAGTATGCTGAAAGTATGAA 
TATACAACGGGATATAGTCTATAAAGAGAGAAATCGTCTAATAGATGGTTCTCGTGACTTAGAGGATGTTGTTGTG 
GATATCATTGAGAGATATACAGAAGAGGTAGCGGCTGATCACTATGCTAGTCGTGAATTATTGTTTCACTTTATTG
TGACCAATATTAGTTTTCATGTTAAAGAGGTTCCAGATTATATAGATGTAACTGACAAAACTGCAGTTCGTAGCTT 

TATGAAGCAGGTGATTGATAAAGAACTTTCTGAAAAGAAAGAATTACTTAATCAACATGACTTATATGAACAGTT
TTTACGACTTTCACTGCTTAAAGCCATTGATGACAACTGGGTAGAGCAGGTAGACTATCTACAACAGCTATCCATG
GCTATCGGTGGTCAATCTGCTAGTCAGAAAAATCCAATCGTAGAGTACTATCAAGAAGCCTACGCGGGCTTTGAA 
GCTATGAAAGAACAGATTCATGCGGATATGGTGCGTAATCTCCTGATGGGGCTGGTTGAGGTCACTCCAAAAGGT 
GAAATCGTGACTCATTTTCCATAA


MSSLSDQELVAKTVEFRQRI^EGESLDDILVEAFAVVREADKRILGMFPYDVQVMGAIVMHYGNVAEMNTGEGKTLT
ATMPVYLNAFSGEGVMV\riTNEYLSKRDAEEMGQVYRFLGLTIGVPFTEDPKKEMKAEEKKLIYASDIIYTTNSNLGF 
DYLNDNLASNEEGKFLRPFNYVIIDEIDDILLDSAQTPLIIAGSPRVQSNYYAIIDTLVTTLVEGEDYIFKEEKEEVWLTTK 
GAKSAENFLGIDNLYKEEHASFARHLVYAIRAHKLFTICDKDYIIRGNEMVLVDKGTGRLMEMTKLQGGLHQAIEAKEH 

VKLSPETRAMASITYQSLFKMFNKISGMTGTGKVAEKEFIETYNMSVVRIPTNRPRQRIDYPDNLYITLPEKVYASLEYIK
QYHAKGNPLLVFVGSVEMSQLYSSlXFREGIAHm^LNANNAAREAQIISESGQMGAVTVATSMAGRGTDIKLGKGVAE
LGGLIVIGTERMESQRIDLQIRGRSGRQGDPGMSKFFVSLEDDVIKKFGPSWVHKKYKDYQVQDMTQPEVLKGRKYRK
LVEKAQHASDSAGRSARRQTLEYAESMNIQRDIVYKERNRLIDGSRDLEDVVVDIIERYTEEVAADHYASRELLFHFIVT
NISFHVKEVPDYIDVTDKTAVRSFMKQVIDKELSEKKELLNQHDLYEQFLRLSLLKAIDDNWVEQVDYLQQLSMAIGG

QSASQKNPIVEYYQEAYAGFEAMKEQIHADMVRNLLMGLVEVTPKGEIVTHFPZ

ID104 879bp

ATGAAACAAGAATGGTTTGAAAGTAATGATTTTGTAAAAACAACAAGCAAGAACAAGCCTGAAGAGCAAGCTCA
AGAGGTTGCAGACAAGGCTGAAGAAAGGATACCCGATCTCGATACACCAATTGAAAAAAATACTCAGTTAGAGG 
AGGAAGTCTCTCAAGCTGAAGTCGAATTGGAAAGCCAGCAAGAAGAGAAAATTGAAGCTCCTGAAGACAGTGAA 
GCGAGAACAGAAATAGAAGAAAAGAAGGCATCTAATTCTACTGAAGAAGAGCCAGACCTTTCTAAAGAAACAGA 
AAAAGTCACTATAGCTGAAGAGAGCCAAGAAGCTCTTCCTCAGCAAAAAGCAACCACGAAAGAGCCACTTCTTAT
CAGTAAATCTTTAGAAAGTCCTTATATCCCCGACCAAGCTCCAAAATCTAGGGATAAATGGAAAGAGCAAGTGCT
TGA1 1 iUIGG TCITGGCTAGTGG AAGCG ATCAAATCTCCTACAAGTAAGTTGGAAACAAGTATCACACACAGTTAC 
ACAGCCTTTCTCTTGCTCATTCTGTTTTCTGCATCTTCCTTTTTCnTTAGTATCT 
GGACATATAGCAAGCATTAACAGTCGCTTCCCTGAGCAGCTAGCTCCT^ 

AGTAGCGACAACACTC 1" 1 C l'l C1'I'1"1'CATTCCTCTTGGGTAGTTTCGTTGTGAGACGATTTATCCACCAGGAAAAG 
GA CTGGACGCTAGAC AAGGTTCTCCAACAATATAGTCAACTCTTGGCAATTCCAATCTCCTCACTGCTATTGCTAG 
TTTCTTTGCTTTCTTTGATAGCCTACGATTTACAGCCCTCTTGTGTGTGA 

MKQEWFESNDFVKTTSKNKPEEQAQEVADKAEERIPDLDTPIEKNTQLEEEVSQAEVELESQQEEKIEAPEDSEARTEIE
EKKASNSTEEEPDLSKETEKVTIAEESQEALPQQKATTKEPLLISKSLESPYIPDQAPKSRDKWKEQVLDFWSWLVEAIKS
PTSKLETSITHSYTAFLLLILFSASSFFFSIYHIKHAYYGHIASINSRFPEQLAPLTLFSIISILVATTLF^SFLLGSFVVRRFIH 
QEKDWTLDKVLQQYSQLLAIPISSLLLLVSLLSLIAYDLQPSCVZ 

ID106 327bp 

ATGTACTTTCCAACATCCTCrGCCITGATTGAATTTCTCATCnTGGCTGTACT 

TGAGATTAGCCAAACCATTAAGCTGATCGCTAATATCAAAGAATCCACACTCTATCCCATTCTCAAAAAATTGGA
AGGCAATAGCTTTCTGACAACCTATTCTAGAGAGTTCCAAGGTCGCATGCGCAAATACTACTCCTTGACAAACGG
TGGTATAGAGCAGCTCTTGACCCTAAAAGATGAATGGGCACTCTATACAGACACCATCAATGGCATCATAGAAGG 
GAGTATCCGCCATGACAAGAACTGA

MYFPTSSALIEFLILAVLEQGDSYGYEISQTIKLIANIKESTLYPILKKLEGNSFLTTYSREFQGRMRKYYSLTNGGIEQLLT
LKDEWALYTDTINGIIEGSIRHDKNZ






ID108 954bp

ATGGATTTTGAAAAAATTGAACAAGCTTATATCTATTTACTAGAGAATGTCCAAGTCATCCAAAGTGATTTGGCGA
CCAACrTTTATGACGCCTTGGTGGAGCAAAATAGCATCTATCTGGATGGTGAAACTGAGCTAAACCAGGTCAAAG 
ACAACAATCAGGCCCTTAAGCGTTTAGCACTACGCAAAGAAGAATGGCTCAAGACCTACCAGTTTCTCTTGATGA 
AGGCTGGGCAAACAGAACCCTTGCAGGCCAATCACCAGTTTACACCGGATGCTATTGCIT^ 

TGTGGAAGAGTTGTTTAAAGAGGAGGAAATTACTATCCTCGAAATGGGTTCTGGGATGGGAATTCTAGGCGCTAT 
i uu 1 GACCTCGCTTACTAAAAAGGTGGATTACTTGGGAATGGAAGTGGATGATTTGCTGATTGATCTGGCAGCT 
AGCATGGCAGATGTAATTGGTTTGCAGGCTGGCTTTGTCCAAGGAGATGCCGTTCGCCCACAAATGCTCAAAGAA 
AGCGATGTGGTCATCAGTGACTTGCCTGTCGGCTATTATCCTGATGATGCCGTTGCGTCGCGCCATCAAGTTGCTT 
CTAGCC^GAACATACTTACGCCCATCACTTGCTCATGGAACAAG 

CTA ill * I CTAGCTCCG AGTG ATTTGTTG ACCAGTCCTCAAAGTGATTTGTTA A AAG A ATGGCTG A AAG AAG AGGC 

GAGTCrGGTTGCTATGATTAGTCrGCCT GAAAA TCTCITTGCrAATGCCAAACAATCTAAGACT^ 

AGAAGAAAAATGAAATAGCAGTAGAGCCTTTTGTTTATCCACTTGCTAGCTTGCAAGATGCAAGTGTTTTAATGAA

ATTTAAAGAAAATTTTCAAAAATGGACTCAAGGTACTGAAATATAA 

MDFEKIEQAYIYLLENVQVIQSDLATNFYDALVEQNSIYLDGETELNQVKDNNQALKRLALRKEEWLKTYQFLLMKA
GQTEPLQANHQFTPDAIALLLVFIVEELFKEEEITILEMGSGMGILGAIFLTSLTKKVDYLGMEVDDLLIDLAASMADVI
GLQAGFVQGDAVRPQMLKESDVVISDLPVGYYPDDAVASRHQVASSQEHTYAHHLLMEQGLKYLKSDGYAIFLAPSD
LLTSPQSDLLKEWLKEEASLVAMISLPENLFANAKQSKTinLQKKNEIAVEPFVYPLASLQDASVLMKFKENFQKWTQG
TEIZ 

ID110 1902bp

ATGATTATTTTACAAGCTAATAAAATTG AACGTTL 1 1 1 1GCAGGAGAGGTTCT1T1 CGATAATATCAACCTGCAGG 
TTGATGAACGAGATCGGATTGCTCTTGTTGGGAAAAATGGTGCAGGTAAGTCTACTCTTTTGAAGATTTTAGT^ 
AGAAGAGGAGCCAACTAGCGGAGAAATCAATAAGAAAAAAGATATTTCTCTGTCTTACCTAGCCCAAGATAGCCG 
TTTTGAGTCTGAAAATACCATCTACGATGAAATGCTTCATGTCT 

CGTCAGATGGAGCTGGAGATGGGTGAAAAGTCTGGTGAGGATTTGGATAAACTGATGTCAGATTATGACCGCTTA 

TCTGAGAATTTTCGCCAAGCAGGTGGCTTTACCTATGAAGCTGATATTCGAGCGAT1UMGAATGGATTCAAGTTTG 

ACGAGTCTATGTGGCAGATGAAAATTGCTGAGCTTTCTGGTGGTCAAAATACTCGTTTGGCACT^ 

CCTTGAAAAGCCCAATCTCTTGGTCTTGGACGAGCCAACTAACCACTTGGATATTGAAACCATCGCCT 

GAATTACTTGGTAAACTATAGCGGTGCCCTCATTATCGTCAGCCACGACCGTTATTTCTTGGACAAGGTTGCGACA 

ATTACGCTAGATTTGACCAAGCATTCCTTGGATCGCTATGTGGGGAATTACrCT 

AAAAGCTAGTTACTGAGGCAAAAAACTATGAAAAGCAACAGAAGGAAATCGCTGCTCTGGAAGACTTTGTCAATC 
GCAATCTAGTTCGTGCTTCAACGACTAAACGTGCTCAATCTCGCCGTAAACAACT 

ACAAGCCTGAAGCTGGCAAGAAAGCAGCCAACATGACCTTCCAGTCTGAAAAAACGTCGGGCAATGTTGTTTTGA 

CTGTTGAAAATGCAGCTGTTGGCTATGACGGGGAAGTCTTGTCACAACCTATCAACCrAGATCTTCGTAAGATGAA 

TGCTGTCGCTATCGTTGGTCCAAATGGTATCGGCAAGTCAACCTTTATCAAGTCTATTGTGGACCAGATTCCTTTT 

ATCAAGGGAGAAAAGCGCTTTGGCGCTAATGTTGAGGTTGGTTACTATGACCAAACCCAAAGCAAGCTGACACCA 

AGTAATACGGTGCTGGATGAACTCTGGAATGATTTCAAACTGACACCAGAAGTTGAAATCCGCAACCGTCTTGGA 

GCCTTCC Mil CTC AGG AG ATG ATGTTA AAA AATCAGTCGGC ATGCTATCTGGTGGCG A A AAAGCTCGTTTGCTTT 

TAGCTAAATTGTCTATGGAAAACAATAACTTTTTGATTCTGGATGAG 

GGAAGTGCTAGAAAATGCCTTGATTGACTTTGATGGAACCTTGCTGTTTGTCAGTCATGATCGTTACTT^ 
CGTGTGGCAACTCATGTTTTGGAATTGTCTGAGAATGGTTCAACTCTCTACCTTGGAGATT^ 

AGAAGAAAGCAACAGCAGAAATGAGTCAGACTGAGGAAGCTTCAACTAGCAATCAAGCAAAGGAAGCAAGTCCA 

GTCAATGACTATCAGGCCCAGAAAGAAAGTCAAAAAGAAGTTCGCAAACTCATGCGACAAATCGAAAGTCTAGA 

AGCTGAAATTGAAGAGCTAGAAAGTCAAAGCCAAGCCATTTCTGAACAAATGTTGGAAACAAACGATGCCGACA 

AACTCATGGAATTACAGGCTGAGCTGGACAAAATCAGCCATCGTCAGGAAGAAGCTATGCTTGAGTGGGAAGAAT 

TATCAGAGCAGGTGTAA 

MIILQANKIERSFAGEVLFDNINLQVDERDRIALVGKNGAGKSTLLKILVGEEEPTSGEINKKKDISLSYLAQDSRFESENT

IYDEMLHVFNDLRRTERQLRQMELEMGEKSGEDLDKLMSDYDRLSENFRQAGGFTYEADIRAILNGFKFDESMWQMK

IAELSGGQNTTRl^LAKMLLEKPNLLVLDEPTNHLDIETIAWLENYLVNYSGALIIVSHDRYFLDKVATITLDLTKHSLDR

YVGNYSRFVELKEQKLVTEAKNYEKQQKEIAALEDFVNRNLVRASTTKRAQSRRKQLEKMERLDKPEAGKKAANMTF 

QSEKTSGNVVLTVENAAVGYDGEVLSQPINLDLRKMNAVAIVGPNGIGKSTFIKSIVDQIPFIKGEKRFGANVEVGYYDQ 

TQSKLTPSNTVLDELWNDFKLTPEVEIRNRLGAFLFSGDDVKKSVGMLSGGEKARLLLAKLSMENNNFLILDEPTNHL 

DIDSKEVLENALIDFDGTLLFVSHDRYFINRVATHVLELSENGSTLYLGDYDYYVEKKATAEMSQTEEASTSNQAKEAS 

PVNDYQAQKESQKEVRKLMRQIESLEAEIEELESQSQAISEQMLETNDADKLMELQAELDKISHRQEEAMLEWEELSEQ 

VZ 

ID111 1179bp 

ATGAATCGCTATGCAGTGCAGTTGATTAGCCGTGGGGCTATCAATAAAATGGGAAATATGCTCTATGATTATGGA 
AATAGTGTCTGGTTGGCTTCTATGGGGACTATAGGACAGACAGTTTTAGGAATGTATCAGATTTCTGAGCTCGTCA 
CATCTATTCTCGTCAATCCCnTTGGCGGAGTTATTTCAGACCGTTTTTCTCGTCGTAAGATTTTAATGACG 
CTTGTTTGTGGGATTCrTTGTCTGGCTATTTCTTTCATAAGGAATGATAGCTGGATGAT^ 

TAACATTGTGCAGGCTATTGCTTTTGCCTTTTCTCGCACAGCCAATAAAGCTATCATAACTGAAGTGGTGGAGAAA 

GATGAO^GTGATCTATAATTCTCGCTTAGAGCTGGTTTTGCAGGTTGTAGGTGTTAGCTCT 

CCTTCTTTTACAGTTTGCAAGTCTCCATATGACGCTACTGCTAGACT 

TGGCTTTCCTTCCAAAAGAGGAAGCAAAAGTTCAAGAGAAAAAGGCnTTTACTGGGAGAGATATTTTTGTAGATA 
TCAAGGA TGG GTTACA CTATATCTGGCAT CAGCA AGAAAi 1 -l'lVIMCCTTTTGCTGGTAGCTTCCAGCGTTAATTT 
CTTTTTTGCAGCTTTTGAATTTCTACTTCCCTTTTCGAATCAGCTT^ 

TAACTATGGGGGCTATTGGTTCCATCATTGGGGCrCTTCTAGCTAGTAAAATTAAAGCTAATATTT 
GATTTTACTGGCTTTGACAGGTGTCGGAGTTTTTATGATGGGAT^ 

ATTTAGTTTGTGAATTGTTTATGAC GATTTT TAATATTCAC \ 1 1 i ' 1' 1 ACTC AAGTAC AA ACCAAGGTTG AG AGCG AA 
TTTCTTGGAAGAGTACTGAGTACAATTTTTACCTTAGCTATTCTATTTATGCCTATTGCAAAAGGATTTATGACAGT
CTTGCCA AGTGTCCATCTTT ATTC f 1 1CI 1 GATTATTGGACTTGGAGTTGTAGCCTTATATTTCTTAGCTCTCGGAT 
ATGTTCGAACTCATTTTGAAAAATTGATATAA 

MNRYAVOLISRGAINKMGNMLYDYGNSVWLASMGTIGQTVLGMYQISELVTSILVNPFGGVISDRFSRRKILMTADLV 
CGILCLAISFIRNDSWMIGALIVANIVQAIAFAFSRTANKAIITEVVEKDEIVIYNSRLELVLQVVGVSSPVLSFLVLQFASL 
HMTLLLDSLTFFIAFVLVAFLPKEEAKVQEKKAFTGRDIFVDIKDGLHYIWHQQEIFFLLLVASSVNFFFAAFEFLLPFSN
QLYGSEGAYASILTMGAIGSIIGALLASKIKANIYNLILLALTGVGVFMMGLPLPTFLSFSGNLVCELFMTIFNIHFFTQV
QTKVESEFLGRVLSTIFTLAILFMPIAKGFMTVLPSVHLYSFLIIGLGVVALYFLALGYVRTHFEKLIZ

ID113 2466bp

ATGCAAAATCAATTAAATGAATTAAAACGAAAAATGCTGGAATTTTTCCAGCAAAAACAA^ 

GCTAGACCTGGCAAGAAAGGTTCAAGTACCAAAAAATCTAAAACCTTAGATAAGTCAGCCATTTTCCCAGCTATT 
TTACTGAGTATAAAAGCCTTATTTAACTTACTCTTTC 

CTTTGGGATACGGAGTGGCCTTATTTGACAAGGTTCGGGTGCCTCAGACAGAAGAATTGGTGAATCAGGTCAAGG 
ACATCTCTTCTATTTCAGAGATTACCTATTCGGACGGGACGGTGATTC 

TTCTATCTCATCTGAGCAAATTTCGGAAAATCTGAAGAAGGCTATCATTGCGACAGAAGATGAACACTTTAAAGA 

ACATAAGGGTGTAGTACCCAAGGCGGTGATTCGTGCGACCTTGGGGAAATTTGTAGGTTTGGGTTCCtCTAGTGGG 

GGTTCAACCTTGACCCAGCAACTAATTAAACAGCAGGTGGTTGGGGATGCGCCGACCTTGGCTCGTAAGGCGGCA 

GAGATTGTGGATGCTCTTGCCITGGAACGCGCCATGAATAAAGATGAGATTTTAACGACCTATCTC^ 

CCTTTGGCCGAAATAATAAGGGACAGAATATTGCAGGGGCTCGGCAAGCAGCTGAGGGAATTTTCGGTGTAGATG 

CCAGTCAGTTGACTGTTCCTCAAGCAGCATTTTTAGCAGGACTTCCACAGAGTCCCATTACTTACTCT 

AAATACTGGGGAGTTGAAGAGTGATGAAGACCTAGAAATTGGCITAAGACGGGCTAAGGCAGTTCTTTACAGTAT 

GTATCGTACAGGTGCATTAAGCAAAGACGAGTATTCTCAGTACAAGGATTATGACCTTAAACAGGACTTTTTACC 

ATCGGGCACGGTTACAGGAATTTCACGAGACTATTTATACTTTACAACTTTGGCAGAAGCTCAAGAACGTATGTAT 

GACTATCTAGCTCAGAGAGACAATGTCTCCGCTAAGGAGTTGAAAAATGAGGCAACrCAGAAGTTTTATCGAGAT 

TTGGCAGCCAAGGAAATTGAAAATGGTGGTTATAAGATTACTACTACCATAGATCAGAAAATTCATTCTGCCATG

CAAAGTGCGGTTGCTGATTATGGCTATCTTTTAGACGATGGAACAGGTCGTGTAGAAGTAGGGAATGTC^ 

GATAACCAAACAGGTGCTATTCTAGGCTTTGTAGGTGGTCGTAATTATCAAGAAAATCAAAATAATCATGCCTTTG 

ATACCAAACGTTCGCCAGCTTCTACTACCAAGCCCTTGCTGGCCTACGGTATTGCTATTGACCAGGGCTTGATGGG 

AAGTGAAACGATTCTATCTAACTATCCAACAAACTTTGCTAATGGCAATCCGATTATGTATGCTAATAGCAAGGG 

AACAGGAATGATGACCTTGGGAGAAGCTCTGAACTATTCATGGAATATCCCTGCTTACTGGACCTATCGTATGCT 

CGTGAAAAGGGTGTTGATGTCAAGGGTTATATGGAAAAGATGGGTTACGAGATTCCTGAGTACGGTATTGAGAGC

TTGCCAATGGGTGGTGGTATTGAAGTCACAGTTGCCCAGCATACCAATGGCTATCAGACCTrTAGCTAATAATGGA 

GTTTATCATCAGAAGCATGTGATTTCAAAGATTGAAGCAGCAGATGGTAGAGTGGTGTATGAGTATCAGGATAAA

CCGGTTCAAGTCTATTCAAAAGCTACTGCGACGATTATGCAGGGATTGCTACGAGAAGTTCTATCCrCTCGTGTGA 

CAACAACCTTCAAGTCTAACCTGACTTCTTTAAATCCTACTCTGGCTAATGCAGATTGGATTGGGAAGACTGGTAC

AACCAACCAAGACGAAAATATGTGGCTCATGCTTTCGACACCTAGATTAACCCTAGGTGGCTGGATTGGGCATGA 

TGATAATCATTCATTGTCACGTAGAGCAGGTTATTCTAATAACTCTAATTACATGGCTCATCTGGTAAATGCGATT

CAGCAAGCTTCCCCAAGCATTTGGGGGAACGAGCGCTTTGCTTTAGATCCTAGTGTAGTGAAATCGGAAGTCTTG 

AAATCAACAGGTCAAAAACCAGAGAAGGTTTCTGTTGAAGGAAAAGAAGTAGAGGTCACAGGTTCGACTGTTACC 

AGCTATTGGGCTAATAAGTCAGGAGCGCCAGCGACAAGTTATCGCTITGCTATTGGCGGAAGTGATGCGGATTAT 

CAGAATGCTTGGTCrAGTATTGTGGGGAGTCTACCAACTCCATCCAGCTCCAGCAGTTCAAGTAGTAGTTCTAGCG 

ATAGCAGTAACTCAAGTACTACACGACC1 1C1 1 Cl'l CAAGGGCGAGACGATAA 

MQNQLNELKRKMLEFFQQKQKNKKSARPGKKGSSTKKSKTLDKSAIFPAILLSIKALFNLLFVLGFLGGMLGAGIALGY

GVALFDKVRVPQTEELVNQVKDISSISEITYSDGTVIASIESDLLRTSISSEQISENLKKAIIATEDEHFKEHKGVVPKAVIR

ATLGKFVGLGSSSGGSTLTQQLIKQQVVGDAPTLARKAAEIVDALALERAMNKDEILTTYLNVAPFGRNNKGQNIAGA

RQAAEGIFGVDASQLTVPQAAFLAGLPQSPITYSPYENTGELKSDEDLEIGLRRAKAVLYSMYRTGALSKDEYSQYKDY

DLKQDFLPSGTVTGISRDYLYFTTLAEAQERMYDYLAQRDNVSAKELKNEATQKFYRDLAAKEIENGGYKITTTIDQKI

HSAMQSAVADYGYLLDDGTGRVEVGNVLMDNQTGAILGFVGGRNYQENQNNHAFDTKRSPASTTKPLLAYGIAIDQG

LMGSETILSNYPTNFANGNPIMYANSKGTGMMTLGEALNYSWNIPAYWTYRMLREKGVDVKGYMEKMGYEIPEYGIE

SLPMGGGIEVTVAQHTNGYQTLANNGVYHQKHVISKIEAADGRVVYEYQDKPVQVYSKATATIMQGLLREVLSSRVTT

TFKSNLTSLNPTLANADWIGKTGTTNQDENMWLMLSTPRLTLGGWIGHDDNHSLSRRAGYSNNSNYMAHLVNAIQQA

SPSIWGNERFALDPSVVKSEVLKSTGQKPEKVSVEGKEVEVTGSTVTSYWANKSGAPATSYRFAIGGSDADYQNAWSSI 

VGSLPTPSSSSSSSSSSSDSSNSSTTRPSSSRARRZ 

ID114 1974bp

ATGAAAAAATTTTATGTAAGTCCAATTTTTCCTATTCTAGTAGGATTGATTGCGTTTGGAGTCTTATCCACTTTCAT
TATTTTTGTTAATAATAATCTGTTGACGGTTTTAATTTTG 1 i i C I I I 1 I GTAGGAGGCTATG ITl'l ITI ATTTAAGAA 
ACTGAGAGTGCATTATACAAGGAGTGATGTAGAACAGATACAGTATGTAAACCACCAAGCGGAAGAAAGTTTGAC 
AGCTCTATTGGAACAGATGCCTGTAGGTGTTATGAAATTGAATTTATCTTCTGGAGAGGTTGAGTGGTTTAATCCC
TATGCTGAATTGATTITGACCAAGGAAGATGGTGATTTTGATTTAGAAGCTGTTCAAACGATTATCAAGGCrTCAG 
TAG G AAA TCCGTCTACTTATGCCAAGCTTGGTG AG A AGCGTTATGCTGTTCATATGG ATG C I 1 C I 1 CCGGTGTTTT 
GTATTTTGTAGATGTATCCAGGGAACAAGCCATAACAGATGAATTGGTAACAAGTAGACCAGTGATTGGGATTGT 
CT CTGT GGATAATTATGATGATTTGGAGGATGAAACrrCTGAGTCAGATATTAGTCAAATCAATAGTTTTGTAGCT 
AATITTATATCAGAGTTTTCAGAAAAACACATGATGTTTTCrCGTCGGGTAAGTATC 

TGACrACACGGTGCrrGAGGGCTTGATGAATGATAAATTTTCTGTTATTGATGCTTTCAGAGAAGAGTCGAAACAG 
AGACAGTTGCCCTTGACCTTAAGTATGGGATTTTCrTATGGCGATGGAAATCATGATGAGATAGGGAAAGTTGCTT 
TGCTCAATTTGAACTTGGCTGAAGTACGTGGTGGCGACCAGGTGGTTGTTAAGGAAAACGACGAAACGAAAAATC 






CAGTTTATTTTGGTGGTGGGTCTGCTGCITCAATCAAGCGTACACGGACTCGTACGCGCGCTATGATGACAGCTAT

TTCAGATAAGATTCGGAGTGTAGATCAGG'l^'rri'GTAGTCGGTCACAAAAATTTAGACATGGATGCTTTGGGCTCT 

GCTGTAGGTATGCAGTTGTTCGCCAGCAATGTGATTGAAAATAGCTATGCTCTTTATGATGAAGAACAAATGTCTC 

CAGATATTGAACGAGCTGTTTCATTCATAGAAAAAGAAGGAGTTACGAAGTTGTTGTCTGTTAAGGATGCAATGG 

GGATGGTGACCAATCGTTCTTTGTTGATTCTTGTAGACCATTCAAAGACAGCCTTAj\CATTA 

TGATTTATTTACCCAAACCATTGTTATTGACCACCATAGAAGGGATCAGGATTTTCCAGATAATGCGGTTATTACT 

TATATCGAAAGTGGTGCAAGTAGTGCCAGTGAGTTGGTAACGGAATTGATTCAGTTCCAGAATTCTAAGAAAAAT

CGTTTGAGTCGTATGCAAGCAAGTGTCITGATGGCTGGTATGATGTTGGATACTAAAAATTTCACCTCGCGAGTAA 

CTAGTCGGACATTTGATGTTGCTAGCTATCTCAGAACGCGCGGAAGTGATAGTATTGCTATCCAGGAAATCGCTGC 

GACAGATTTTGAAGAATATCGTGAGGTCAATGAACTTATTTTACAGGGGCGTAAATTAGGTTCAGATGTACTAATA 

GCAGAGGCTAAGGACATGAAATGCTATGATACAGTTGTTATTAGTAAGGCAGCAGATGCCATGTTAGCCATGTCA 

GGTATTGAAGCGAGTTTTGTTCITGCGAAGAATACACAAGGATTTATCTCTATCTCAGCTCGAAGTCGTAG 

TGAATGTACAACGGATTATGGAAGAGTTAGGCGGTGGAGGCCACTTTAATTTGGCAGCAGCTCAAATTAAAGATG 

TAACCTTGTCAGAAGCAGGTGAAAAACTGACAGAAATTGTATTAAATGAAATGAAGGAAAAGGAGAAAGAAGAA

TGA 

MKKFYVSPIFPILVGLIAFGVLSTFIIFVNNNLLTV^

QMPVGVMKLNLSSGEVEWFNPYAELILTKEDGDFDLEAVQTHKASVGNPSTYAKLGEKRYAVHMDASSGVLYFVDVS 

REQAITDELVTSRPVIGIVSVDNYDDLEDET^ESDISQINSFVANnSEFSEKHMMFSRRVSMDRFYLFTDYTVLEGLMN 

DKFSVIDAFREESKQRQLPLTLSMGFSYGDGNHDEIGKVALLNLNLAEVRGGDQWVKENDETKNPVYFGGGSAASIK 

RTRTRTRAMMTAISDKIRSVDQVFVVGHKNLDMDALGSAVGMQLFASNVIENSYALYDEEQMSPDIERAVSREKEGV 

TKLLSVKDAMGMVTNRSLLILVDHSFO*ALTI^K£FYDLFTQTrVIDHHRRDQDFPDNAVITYIESGASSASELVTELIQFQ 

NSKKNRL^RMQASVLMAGMMLDTKNFTSRVTSRTFDVASYLRTRGSDSIAIQEIAATDFEEYREVNELILQGRKLGSDV 

LIAEAKDMKCYDTVVISKAADAMLAMSGIEASFVLAKOTQGFISISARSRSKLNVQRIMEELGGGGHFNLAAAQIKD^ 

LSEAGEKLTEIVLNEMKEKEKEEZ 

ID115 663bp

ATGAAGTGCTTGTTATGTGGGCAGACTATGAAGACTGTTTTAACTTTTAGTAGTCTCTTACTTCTGAGGAATGATG
ACTCTTGTCTTTGTTCAGACTGTGATTCTACTTITGAAAGAATTGGGGAAGAGAACTGTCC 

AGAGTTGTCAACAAAGTGTCAAGATTGTCAACTTTGGTGTAAAGAGGGAGTTGAAGTCAGTCATAGAGCGATTTT 

TACTTACAATCAAGCTATGAAGGATTTITTCAGTCGGTATAAGTTTGATGGAGACTTCCT 

GCTTCATTTTTAAGTGAGGAGTTGAAAAAGTACAAAGAGTATCAATTTGTTGTAATTCCCCT 

ATGCTAATAGAGGATTTAATCAGGTTGAGGGCTTGGTAGAGGCAGCAGGCTTTGAGTATCTGGATTTATTAGAGA 
AAAGAGAAGAGAGAGCCAGTTCTTCTAAAAATCGTTCAGAGCGCTTGGGGACAGAACTTC L 1 UCU I ATTAAAA 
GTGGAGTCACTATTCCTAAAAAAATCCTACTTATAGATGATATCrATACTACAGGAGCAACTATAAATCGTGTTAA 
GAAACTGTTGGAAGAAGCTGGTGCTAAGGATGTAAAAACATTTTCCCTTGTAAGATGA 

MKCLLCGQTMKTVLTFSSLLLLRNDDSCLCSDCDSTFERIGEENCPNCMKTELSTKCQDCQLWCKEGVEVSHRAIFTY 
NQAMKDFFSRYKFDGDFLLRKVFASFLSEELKKYKEYQFVVIPLSPDRYANRGFNQVEGLVEAAGFEYLDLLEKREER 
ASSSKNRSERLGTELPFFIKSGVTIPKKILLIDDIYTTGATINRVKKLLEEAGAKDVKTFSLVRZ

ID116 1299bp 

ATGAAAGTAAATTTAGATTATCTCGGTCGTTTATTTACTGAGAATGAATTAACAGAAGAAGAACGTCAGTTGGCG 

GAGAAACTTCCAGCAATGAGAAAGGAGAAGGGGAAACTTTTCTGTCAACGCTGTAATAGTACTATTCTAGAAGAA 

TGGTATTTGCCCATCGGTGCTTACTATTGTCGAGAGTGCTTGCTGATGAAGCGAGTCAGAAGTGATCAAACTTTAT

ACTATTTTCCGCAGGAGGATTTTCCAAAGCAAGATGTTCTCAAATGGCGCGGCCAATTAACTCCTTTTCAAGAGAA

GGTGTCAGAGGGATTGCTTCAAGTAGTAGACAAGCAAAAGCCAACCTTAGTTCATGCGGTAACAGGAGCTGGAAA

GACAGAAATGATTTATCAAGTAGTGGCTAAAGTGATCAATGCGGGTGGTGCAGTGTGTTTGGCTAGTCCTCGCAT 

AGATGTTT GTTT GGAGCTGTACAAGCGCCTGCAACAGGAl'ITTTCri'GCGGGATAGCrrTGCTACATGGAGAATC^ 

GAACCTTATTTTCGAACACCACTAGTTGTTGCAACAACCCATCAGTTATTGAAGTTTTATCAA 

GAT AGTGGATGAAGTAG ATGCTT TTCCTTATGTTGATAATCCCATGCTTTACCACGCTGTCAAGAATAGTGT AAAG 

GAGAATGGATTGAGAATCTTTTTAACAGCGACTTCGACCAATGAGTTAGATAAAAAGGTCCGTTTAGGAGAACTA 

AAAAGACTGAATTTACCGAGACGGTTTCATGGAAATCCGTTGATTATTCCAAAACCAATTTGGTTATCGGATTTTA 

ATCGCTACT TAGAC AAGAATCGTTTGTCACCAAAGTTAAAGTCCTATATTGAGAAGCAGAGAAAGACAGCTTATC 

CGTTACTCATTTTTGCTTCAGAAATTAAGAAAGGGGAGCAGT^ 

AGAAAATTGGCTTTGTATCTTCTGTAACAGAGGATCGATTAGAGCAAGTACAAGCnTTTCGAGATGGAGAACTGA 
CAATACTTATCAGTACGACAATCTTGGAGCGCGGAGTTACCITCCCTTGTGTGGATG i ' l ' I ' l CGTAGTAGAGGCCAA 
TCATCG TTTGT TTACCAAGTCTAGTTTGATTCAGATTGGTGGACGAGTTGGACGAAGCATGGATAGACCGACAGGA 
GATTTGC I I 1 li'l'l CCATGATGGGTTAAATGCTTCAATCAAGAAGGCG ATTAAGGAAATTCAGATGATGAATAAGG 
AGGCTGGTCTATGA 

MK V NLD YLG RLFTENELTEEERQL AEKLP A M RKEKGKLFCQRCNSTILEEWYLPIG A YYCRECLLMKR VRS DQTL Y YF 
PQEDFPKQDVLKWRGQLTPFQEKVSEGLLQVVDKQKPTLVHAVTGAGKTEMIYQVVAKVINAGGAVCLASPRIDVCL 
ELYKRLQQDFSCGIALLHGESEPYFRTPLWATTHQLUCFYOAFDLLIVDEVDAFPYVDNPMLYHAVKNSVKENGLRIF
LTATSTNELDKKVIU-GELKRLNLPRiU=HGNPLUPKPIWl^DFNRYLDKNRI^PKLKSYIEKQRKTAYPLUFASEIKKGE 
QLAEILQEQFPNEKIGFVSSVTEDRLEQVQAFRDGELT1LISTTILERGVTFPCVDVFVVEANHRLFTKSSLIQIGGRVGRS 
MDRPTG DLLFFHDGLN AS IKKAIKEIQM M NKE AG LZ 

ID117 870bp

ATGCAAATTCAAAAAAGTTTTAAGGGGCAGTCTCCCTATGGCAAGCTGTATCTAGTGGCAACGCCGATTGGCAAT 
CTAGATGATATGACTTTTCGTGCTATCCAGACCTTGAAAGAAGTGGACT 
10 ACAGGGCTTTTGCTCAAGCATTTTGACATTTCCACC 

ATTCCTGATTTGATTGGTTTCTTGAAAGCAGGGCAAAGTATTGCTCAGGTCTCTGATGCCGGTTTGCCT 
CAGACCCTGGTCATGATTTAGTTAAGGCAGCTATTGAGGAAGAAATTGCAGTTGTGACAGTTCCAGGTGCCTCTGC 
AGGAATTTCTGCCTTGATTGCCAGTGGTTTAGCGCCACAGCCACATATCrrri ACGG ITnTiACCGAGAAAATCA 
GGTCAGCAGAAGCAATTTTTTGGCTTGAAAAAAGATTATCCTGAAACACAGATTTTTTATGAATCACCT 
15 TAGCAGACACGTTGGAAAATATGTTAGAAGTCTACGGTGACCGCTCCGTTGTCTTGGTCAGGGAATTGACCAAAA 
TCTATGAAGAATACCAACGAGGTACTATCTCTGAGTTATTAGAAAGCATTGCTGAAACGCCACTCAAGGGCGAAT 
GTCTTCTCATTGTTGAGGGTGCCAGTCAGGGTGTGGAGGAAAAGGACGAGGAAGACTTGTTCGTAGAAATTCAAA 
CCCGCATCCAGCAAGGTGTGAAGAAAAACCAAGCTATCAAGGAAGTCGCTAAGATTTACCAGTGGAATAAAAGTC 
AGCTCTACGCTGCCTACCACGACTGGGAAGAAAAACAATAA 






MQIQKSFKGQSPYGKLYLVATPIGNLDDMTFRAIQTLKEVDW1AAEDTRNTGLLLKHFDISTKQISFHEHNAKEKIPDLI 
GFLKAGQSIAQVSDAGLPSISDPGHDLVKAAIEEEIAVVTVPGASAGISALIASGLAPQPHIFYGFLPRKSGQQKQFFGLKK 
DYPETQIFYESPHRVADTLENMLEVYGDRSVVLVRELTKIYEEYQRGTISELLESIAETPLKGECLLIVEGASQGVEEKDE 
EDLFVEIQTR1QQGVKKNQAIKEVAKIYQWNKSQLYAAYHDWEEKQZ 


ID118 345bp



ATGATAAAGAAAGGAAAGGGCTGTTTTATGGACAAAAAAGAATTATTC^ 

TTATTGGTAACCTTAGCCGATGTGGAAGCCATCAAGAAAAATCTCAAGAGCCTGGTAGAGGAAAATACAGCTCTT 
30 CGCTTGGAAAATAGTAAGTTGCGAGAACGCTTGGGTGAGGTGGAAGCAGATGCTCCTGTCAAGGCCAAGCATGTT 
CGCGAAAGTGTCCGTCGTATTTACCGTGATGGATTTCACGTATGTAATGATTTTTATGGACAACGTCGAGAGCAGG 
ACGAAGAATGTATGTTTTGTGACGAGTTGTTATACAGGGAGTAA 

MIKKGKGCFMDKKELFDALDDFSQQLLVTLADVEAIKKNLKSLVEENTALRLENSKLRERLGEVEADAPVKAKHVRES
VRRIYRDGFHVCNDFYGQRREQDEECMFCDELLYREZ

ID119 639bp

ATGT CAAA AGGATTTTTAGTCTCTCTTGAGGGACCAGAGGGAGCAGGCAAGACCAGTGTTTTAGAGGCTCTGCTA 
40 CCAATTTTAG AGGA AAAAGGAGTAGAGGTGTTGACGACCCGTGAACCTGGCGGAGTCTTGATTGGGGAGAAGATT 
CGGGAAGTGATTTTGGATCCAAGTCATACTCAGATGGATGCTAAAACAGAGCTACTTCTCTATATTGCCAGTCGCA 
GACAGCATTTGGTGGAAAAAGTTCTTCCAGCCCTTGAAGCTGGCAAGTTGGTCATCATGGATCGTTTTATCGATAG 
TTCTGTTGCCTATCAGGGATTTGGTCGTGGCTTAGATATTGAAGCCATTGACTGGCTCAATCAGTTTGCGACAGAT 
GGCCrCAAACCCGATTTGACACTCTATTTTGACATCGAGGTGGAAGAAGGGCTGGCTCGTATTGCTGCTAATAGTG 
45 ACCGCGAGGTTAATCGTTTGGATTTGGAAGGGTTGGACTTGCATAAAAAAGTTCGTCAAGGCTACCTTT 

GGATAAAGAGGGAAATCGCATTGTCAAGATTGATGCTAGTCTCCCTTTGGAGCAAGTTGTGGAAACTACCAAGGC 
TGTCTTGTTTGACGGAATGGGCTTGGCCAAATGA 

MSKGFLVSLEGPEGAGKTSVLEALLPILEEKGVEVLTTREPGGVLIGEKIREVILDPSHTQMDAKTELLLYIASRRQHLVE 
3U KVLPALEAGKLVIMDRFIDSSVAYQGFGRGLDIEAIDWLNQFATDGLKPDLTLYFDIEVEEGLARIAANSDREVNRLDL 
EGLDLHKKVRQGYLSLLDKEGNRIVKIDASLPLEQVVETTKAVLFDGMGLAKZ 

ID120 408bp 

55 ATGGT AG A AC A A AG A A A ATC AATT ACC ATG A A A G ATGTTG CTTT AG AAG C AGG AGTT AGTGTTGG A A CTGTTTC A 

CGTGTAATTAATAAAGAAAAAGGCATTAAAGAAGTAACTTTGAAAAAAGTGGAACAAGCGATTAAAACTTTGAAT 
TACATTCCAGATTACTACGCTAGAGGAATGAAAAAAAATCGAACAGAAACGATTGCAATCATTGTACCAAGTATC 
TGGCATCCC 1 1 C ITI'l CAGAj^TTTGCTATGCATGTGGAAAATGAAGTCTATAAGAGAAATAACAAATTACTCTTAT 
GTTCTATCAATGGTACAAATAGAGAGCAAGACTATCTGGAGATGTTGCGTCATAATAAAGTTGATGGAGTGGTTG 
CCATTACCTATAGGCCAATTGAACATTACTTGACGTCAGGAATTCCCTTTGTTAGTATTGACCGCACATACTCAGA 
G ATTG CC ATTCCTTGTGTTTC A 






MVEQRKSITMKDVALEAGVSVGTVSRVINKEKGIKEVTLKKVEQAIICTLNYIPDYYARGMKKNRTETIAIIVPSIWHPFF 
SEFAMHVENEVYKRNNKLLLCSINGTNREQDYLEMLRHNKVDGVVAITYRPIEHYLTSGIPFVSIDRTYSEIAIPCVS 
ID121 285bp

ATGAATATATTTAGAACAAAGAATGTTAGTTTAGATAAAACAGAGATGCATAGGCATTTGAAGTTATGGGATTTG 
ATTTTGCTGGGTATCGGAGCCATGGTAGGGACAGGCGTCITTACAATCACAGGTACTGCAGCTGCAACACTTGCTG 
5 GCCCAGCCCTAGTGATTTCAATCGTTATTTCTGCCTTG 

TCGCGAGTACCCGCTACAGGAGGTGCCTATAGTTACCTCTATGCTATCTTAGGAGAATTCCCTGCCTGGTTGGCTG 
GTTGGTTAACCATGATGGAGTTCATGACAGCCATATCAGGCGTAGCTTCGGGTTGGGCAGCTTATTTTAA 

MNiraTKWSLDKTEMHRHUCLWDLIlXGIGAMVCrTGVFnTGTAAATLAGPALVISIVISAL 
1 0 ATGGA YSYLYAILGEFPAWLAGWLTMMEFMTAISGVASGWAAYF 

ID124 1311bp 

ATGAAATCAAGAGTAAAGGAAACGAGTATGGATAAAATTGTGGTTCAAGGTGGCGATAATCGTCTGGTAGGAAGC 
15 GTGACGATCGAGGGAGCA AAAA ATGCAGTCTTACCCTTGTTGGCAGCGACTATTCTAGCAAGTGAAGGAAAGACC 
^T^EZ^* ^* A ATGTTCCG ATTTTGTCGG ATGTCTTT ATT ATG AATC AG GT AGTTGGTGGTTTG AATG CC AAG GTTG 
ACTTTGATGAGGAAGCTCATCTTGTCAAGGTGGATGCTACTGGCGACATCACTGAGGAAGCCCCTTACAAGTATG 
TCAGCAAGATGCGCGCCTCCATCGTTGTATTAGGGCCAATCCTTGCCCGTGTGGGTCATGCCAAGGTATCCATGCC 
AGGTGGTTGTACGATTGGTAGCCGTCCTATTGATCTTCATTTGAAAGGTCTGGAAGCTATGGGGGTTAAGATTAGT 
20 CAGACAGCTGGTTACATCGAAGCCAAGGCAGAACGCTTGCATGGTGCTCATATCTATATGGACnTCCAAGTGTTG 
GTGCAACGCAGAACTTGATGATGGCAGCGACTCTGGCTGATGGGGTGACAGTGATTGAGAATGCTGCGCGTGAGC 
CTGAGATTGTTGACTTAGCCATTCTCOrTAATGAAATC 

CCATTACTGGTGTTGAGAAACTTCATGGTACGACTCACAATGTAGTCCAAGACCGTATCGAAGCAGGAACCTTTAT 
GGTAGCTGCTGCCATGACTGGTGGTGATGTCTTGATTCGAGACGCTGTCTGGGAGCACAACCGTCCCTTGATTGCC 
25 AAGTTACTTGAAATGGGTGTTGAAGTAATTGAAGAAGACGAAGGAATTCGTGTTCGTTCTCAACTAGAAAATCT 
AAAGCTGTTCATGTGAAAACCTTGCCCCACCCAGGATTTCCAACAG 

CAGTTGCAAAAGGCGAATCAACCATGGTGGAGACAGTTTTCGAAAATCGTTTCCAAACCTAGAAGAGATGCGCCG 
CATG GGCTTGCATTCTGAGATTATCCGTGATACAGCTCGTATTGTTGGTGGACAGCCTTTGCAGGGAGCAGAAGTT 
CTTTCAACTGACCTTCGTGCCAGTGCGGCCnTGATTTTGACAGGTTTGGTAGCACAGGGAGAAACTGTGGTCGGTA 
30 AATTGGTTCACTTGGATAGAGGTTACTACGGTTTCCATGAGAAGTTGGCGCAGCTAGGTGCTAAGATTCAGCGGAT 
TGAGGCAAGTGATGAAGATGAATAA 

MKSRVKETSMDKIVVQGGDNRLVGSVT1EGAKNAVLPLLAATILASEGKTVLQNVPILSDVFIMNQVVGGLNAKVDFD 
EEAHLVKVDATGDrTEEAPYKYVSKMRASIVVLGPILARVGHAKVSMPGGCTIGSRPIDLHLKGLEAMGVKISQTAGYIE 
35 AKAERLHGAHIYMDFPSVGATQNLMMAATLADGVTVIENAAREPEIVDLAILLNEMGAKVKGAGTETmTGVEKLHG 
TTHNVVQDRIEAGTFMVAAAMTGGDVLIRDAVWEHNRPL1AKLLEMGVEVIEEDEGIRVRSQLENLKAVHVKTLPHP 
GFPTDMQAQFTALMTVAKGESTMVETVFENRFQHLEEMRRMGLHSEIIRDTARIVGGQPLQGAEVLSTDLRASAALIL 
TGLVAQGETVVGKLVHLDRGYYGFHEKLAQLGAKIQRIEASDEDEZ

ID125 1101bp

ATGTTATTAGCGTCAACAGTAGCCTTGTCATTTGCCCCAGTATTGGCAACTCAAGCAGAAGAAGTTCTTTGGACTG 
CAC GTAG TGTTGAGCAAATCCAAAACGATTTGACTAAAACGGACAACAAAACAAGTTATACCGTACAGTATGGTG 
ATACTTTG AGCA CCATTGCAGAAGCCTTGGGTGTAGATGTCACAGTGCTTGCGAATCTGAACAAAATCACTAATAT 

45 GGACTTGATTTTCCCAGAAACTGTTTTGACAACGACTGTCAATGAAGCAGAAGAAGTAACAGAAGTTGAAATCCA 
AACACCTCAAGCAGACTCTAGTGA AGAAG TGACAACTGCGACAGCAGATTTGACCACTAATCAAGTGACCGTTGA 
TGATCAAACTGTTCAGGTTGCAGACCTTTCTCAACCAATTGCAGAAGTTACAAAGACAGTGATTGCTTCTGAAGAA 
GTGGCACCATCTACGGGCACTTCTGTCCCAGAGGAGCAAACGACCGAAACAACTCGCCCAGTTGCAGAAGAAGCT 
CCTCAGGAAACGACTCCAGCTGAGAAGCAGGAAACACAAACAAGCCCTCAAGCTGCATCAGCAGTGGAAGCAAC 

50 TACAACAAGTTCAGAAGCAAAAGAAGTAGCATCATCAAATGGAGCTACAGCAGCAGTTTCTACTTATCAACCAGA 
AGAAACGAAAGTAATTTCAACAACTTACGAGGCTCCAGCTGCGCCCGATTATGCTGGACTTGCAGTAGCAAAATC 
TGAAAATGCAGGTCXTCAACCACAAACAGCTGCCTTTAAW 

AGTGGTTATCGTCCAGGAGACAGTGGAGATCACGGAAAAGGTTTGGCTATCGACTTTATGGTACCAGAACGTTCA 
GAATTAGGGGATAAGATTGCGGAATATGCTATTCAAAATATGGCCAGCCGTGGCATTAGTTACATCATCTGGAAA 
55 CAACGTTTCTATGCTCCATTCGATAGCAAATATGGGCCAGCTAACACTTGGAACCCAATGCCAGACCGTGGTAGT 
GTGACAGAAAATCACTATGATCACGTTCACGTTTCAATGAATGGATAA 

MLI^STVAl^FAPVLATQAEEVLWTARSVEQIQNDLTKTD^^ 

IPPETVLTTTVNEAEEVTEVEIQTPQADSSEEVTTATADLTTNQVTVDDQTVQVADl^QPIAEVTKTVIASEEVAPSTGTS 
00 VPEEQTTETTRPVAEEAPOETTPAEKQETQTSPQAASAVEATTTSSEAKEVASSNGATAAVSTYQPEETKVISTTYEAPA 
APDYAGLAVAKSENAGLQPQTAAFKKKLLTCLALHPLVV1VQETVEITEKVWLSTLWYQNVQNZGIRLRNMLFKIWPA 
VALVTSSGNNVSMLHSIANMGQLTLGTQCQTVVVZQKITMITFTFQZMD 
ID126 1281bp

5 TTGTTTAAGAAAAATAAAGACATTCTTAATATTGCATTC 

GAATGGTGGACAGTTATTTCGTTGCTCATTTAGGATTGATAGCTATTTCAGGGGTTTCAGTAG 

CACCATTTATCAGGCGATTTTCATCGCTCTGGGAGCTGCTATTTCCAGTGTTATTTCAAAAAGCATAGG 

GACCAGTCGAAGTTGGCCTATCATGTGACTGAGGCGTTGAAGATTACCTTACT 

TGTCCATCTTCGCTGGGAAAGAGATGATAGGACTTTTGGGGACGGAGAGGGATGTAGCTGAGAGTGGTGGACTGT 
10 ATCTATCTTTGGTAGGCGGATCGATTGTTCTCTTAGGTTTAATGACTAGTCTAGGAGCCTTGATTCGTG 
TAATCCACGTCTGCCTCTCTATGTTAGTT^^ 

TCTGGATATGGGGATAGCTGGTGTTGCTTGGGGGACAATTGTGTCTCGTTTGGTTGGTCTTGTGATTT^ 
AATTAAAACTGCCTTATGGGAAGCCAACTTTTGGTTTAGATAAGGAACTGTTGACCTTGGCTTTACCAGCAG 
AGAGCGACTTATGATG AGGGCTGGAGATGTAGTGATCATTGCCTTGGTCG 1 1 1LI 1 1 1 GGGACGGAGGCAGTTGCT 

15 GGGAATGCAATCGGAGAAGTCTTGACCCAGTTTAACTATATGCCTGCCTTTGGCGTCGCTACGGCAACGGTCATG 
CTGTTGGCCCGAGCAGTTGGAGAGGATGATTGGAAAAGAGTTGCTAGTTTGACT 
TGTTCCTCATGTTGCCCCTGTCCITTAGTATATATGTCTTGGGTGTACCATTAACTCATCTCTATA 
CTAGCGGTGGAGGCTAGTGTTCTAGTGACACTGTTTTCACTACTTGGGACCCCTATGACGACAGGAACAGTCATCT 
ATACGGCAGTCTGGCAGGGATTAGGAAATGCACGCCTCCCTTTTTATGCGACAAGTATAGGAATGTGGTGTATCC 

20 GCATTGGGACAGGATATCTGATGGGGATTGTGCTTGGTTGGGGCTTGCCTGGTATTTGGGCAGGGTCTCT 
TAATGGTTTTCGCTGGTTATTTCTACGCTATCGTTACCAGCGCTATATGAGCTTGA^ 

LFKKNKDILNIALPAMGENFLQMLMGMVDSYLVAHLGLlAISGVSVAGNimYQAIFlALGAAISSVISKSlGQKIX?SKLA 
YHVTEALKITLU^FLLGFLSIFAGKEMIGLLGTERDVAK^ 

25 SNALNILFSSLAIFVLDMGIAGVAWGTIVSRLVGLVILWSQLKLPYGKPTFGLDKELLTLALPAAGERLMMRAGDVVIIA 
LVVSFGTEAVAGNAIGEVLTQFNYMPAFGVATATVMLLARAVGEDDWKRVASLSKQTFWLSLFLMLPL5FSIYVLGVP 
LTHLYTTDSLAVEASVLVTLFSLLGTPMTTGTVIYTAVWQGLGNARLPFYATSIGMWCIRIGTGYLMGIVLGWGLPGIW 
AGSLLDNGFRWLFLRYRYQRYMSLKGZ 

ID127 894bp

GTGGGA AGAAT TATCAGAGCAGGTGTAAAGATGGAACATCTTGGAAAAGTATTTCGTGAATTTCGAACAAGTGGA 
AATTATTCTTTAAAGGAAGCAGCAGGCGAATCCTGCrCTACCTCTCAGTTATCT 

ACC TGGC AGTCTCCCGiriCl^-IGAGATTTTGGATAACATTCATGTAACAATCGAAAATTTCATGGATAAGGCAAG 
35 GAATTTTCATAATCATGAACATGTGTCTATGATGGCACAGATTATCCCACTTTACTATTCAAACGATATTGCAGGT 
TTTC AAAAGCTTCAAAGAGAACAACTTGAAAAGTCTAAGAGTTCGACGACT 

TTTTGCTACAAGGTCTGATTTGTCAAAGAGATGCGAGTTATGATATGAAGCAGGATGATTTGGGTAAGGTAGCAG 
- ATTATCTCTTCAAAACAGAAGAATGGACCATGTATGAGTTGATTCTTTTCGGTAACCTCTATA 

AGACTATGTC ACTCG GATTGGTAGAGAAGTTATGGAGAGGGAGGAATTTTACCAAGAGATTAGTCGCCATAAGAG 
4U ATTAGTGTTGATTTTGGCCCTCAATTGTTACCAGCATTGTTTAGAGCATTCTTCTTTTTATAATC 

AGGCTTATACAGAGAAGATTATTGACAAAGGTATTAAGCTTTATGAGCGTAATGTTTTCCATTATTTAAAAGGTTT 
TGCCTTATATCAAAAAGGACAGTGTAAAGAAGGCTGTAAGCAGATGCAAGAGGCCATGCATATTTTTGATGTGTT 
AGGTCTTCCAGAGCAAGTAGCCTATTATCAGGAACACTACGAAAAATTTGTCAAAAGTTAA 

45 VGRIIRAGVKMEHLGKVFREFRTSGNYSLKEAAGESCSTSQI^RFELGESDLAVSRFFEILDNIHVTIENFMDKARNFHN 
HEHVSMMAQIIPLYYSNDIAGFQKLQREOLEKSKSSTTPLYFELNWILLQGLICORDASYDMKQDDLGKVADYLFKTEE 
WTMYELILFGNLYSFYDVDYVTRIGREVMEREEFYQEISRHKRLVLILALNCYQHCLEHSSFYNANYFEAYTEKIIDKGI 
KLYERNVFHYLKGFALYQKGQCKEGCKQMQEAMHIFDVLGLPEQVAYYQEHYEKFVKSZ 
TABLE 3
ID1 1068bp 

5 ATGTCTAACATTCAAAACATGTCCCTGGAGGACATCATGGGAGAGCGCTTTGGTCGCTACTCCAAGTACATTATTC 
^AAGACCGGGCTTTGCCAGATATTCGTGATGGGTTGAAGCCGGTTCAGCGCCGTATTCTTTATTCTATGAATAAGG 
TAGCAATACTTTTGACAAGAGCTACCGTAAGTCGGCCAAGTCAGTCGGGAACATCATGGGGAATTTCCACCCACA 
CGGGGATTCTTCTATCTATGATGCCATGGTTCGTATGTCACAGA^ 

CACGGTAATAACGGTTCTATGGACGGAGATCCTCCTGCGGGTATGCGTTATACTGAGGCACGTTTGTCTGAAATTG 
10 CAGGCTACLTtlTl CAGGA TATC GAGAAAAAGACAGTTCCTI'riGCATGGAACTTTGACGATACGGAGAAAGAAC 

CAACGGTCTTGCCAGCAGCCTTTCCAAACCTCTTGGTCAATGGTTCGACTGGGATTTCGGCTGGTTATGCCACAGA 
CATTCCTCCCCATAATTTAGCTGAGGTCATAGATGCTGCAGTTTACATGATTGACCACCCAACTGCAAAGATTGAT 
AAACTCATGGAATTCTTGCCTGGACCAGACTTCCCTACAGGGGCTATTATTCAGGGTCGTGATGAAATCAAGAAA 
GCTTATGAGACTGGGAAAGGGCGCGTGGTTGTTCGTTCCAAGACTGAAATTGAAAAGCTAAAAGGTGGTAAGGAA 
15 CAAATCGTTATTATTGAGATTCCTTATGAAATCAATAAGGCCAATCTAGTCAAGAAAATCGATGATGTTCGTGTTA 
ATAACAAGGTAGCTGGGATTGCTGAGGTTCGTGATGAGTCTGACCGTGATGGTCTTCGTATCGCTATCGAACTTAA 
GAAAGACGCTAATACTGAGCTTGTTCTCAACTACTTATTTAAGTACACCGACCTACAAATCAACT 
ATGGTGGCGATTGACAATTTCACACCTCGTCAGGTTGGATTGTTCCAATCCTGTCTAGCTATATCGCTCACCGTCG 
AGAAGTGA 


MSNIQNMSLEDIMGERFGRYSKYIIQDRALPDIRDGLKPVQRRILYSMNKDSmTOKSYRKSAKSVGNIMGNFHPHGDS 
SIYDAMVRMSQNWKNREILVEMHGNNGSMDGDPPAAMRYTEARLSEIAGYLLQDIEKKTVPFAWNFDDTEKEPTVLP 
AAFPNLLVNGSTGISAGYATDIPPHNLAEVIDAAVYMIDHPTAKIDKLMEFLPGPDFPTGAIIQGRDEIKKAYETGKGRV 
VVRSKTEIEKLKGGKEQIVIIEIPYEINKANLVKKIDDVRVN^ 
YTDLQINYNFNMVAIDNFTPRQVGLFQSCLAISLTVEKZ

ID12 684bp 

ATGCCGACATTAGAAATAGCACAAAAAAAACTGGAGTTCATTAAGAAGGCAGAj\GAATATTACAATGCCTTGTGT 
30 ACAAATATACAGTTGAGCGGAGATAAACTAAAAGTAATTTCCGTTACTTCTGTTAACCCTGGGGAAGGAAAAACA 
ACTACTTCCATAAATATAGCATGGTCGTTTGCGCGTGCAGGCTATAAAACTCl'IM'IGATCGATGGCGATACTCGAA 
ATTCAGTTATGTTAGGAGTTTTTAAATCTCGTGAAAAAATTACAGGGCT 

TTTATCTCACGGTTTATGTGATACAAATATTGAAAATTTATTTGTAGTTCAuATCGGGATCTGTATCACCAAACCCT 
ACAGCCTTGTTACAAAGTAAAAATTTTAATGATATGATTGAAACATTGCGTAAATATTTTGATTATATC 
35 ATACACCGCCTATTGGAATTGTTATTGATGCGGCAATTATCACTCAAAAGTGTGATGCGTCCATCTTGGTAACAGC 
AA C AG GTG AG G CG A AT A A A CGTG AT ATCC A A A A A G CG AA AC A AC A ATT AAAA C AAAC AGGG AA A CTGTTC CT AG 
GAGTTGTTTTAAATAAATTGGATATCrCGGTTAATAAGTATGGAGTTTACGGTTCCrATGGAAATTATGGTAAAAA 
ATAA 

40 MPTLEIAQKKLEFIKKAEEYYNALCTNIQL^GDKLKVISVTSVNPGEGKTTTSINIAWSFARAGYKTLLIDGDTRNSVML 
GVFKSREKITGLTEFLSGTADLSHGLCDTNIENLFVVQSGSVSPNPTALLQSKNFNDMIETLRKYFDYIIIDTPPIGIVIDAA 
IITQKCDASILVTATGEANKRDIQKAKQOLKQTGKLFLGVVLNKLDISVNKYGVYGSYGNYGKKZ 

ID13 1182bp


ATGGA GGCAAATATGAAACATCTAAAAACATTTTACAAAAAATGGTTTCAATTATTAGTCG 
TTTTTAGTGGAGCCTTGGGTAGTTTTTCAATAACTCAACTAACT 

TAGTACTATTACACAAACTGCCTATAAGAACGAAAATTCAACAACACAGGCTGTTAACAAAGTAAAAGATGCTGT 
TGTTTCTGTT ATT A CTT ATTCG G CAAACAGACAAAATAG CGT ATTTG G C A ATG ATG AT ACTG A C AC AG ATT CTC AG 

50 CGAATCTCTAGTGAAGGATCTGGAGTTATTTATAAAAAGAATGATAAAGAAGCTTACATCGTCACCAACAATCAC 
GTTA TTAA TGGCGCCAGCAAAGTAGATATTCGATTGTCAGATGGGACTAAAGTACCTGGAGAAATTGTCGGAGCT 
GACACl riCTCTGATATTGCTGTCGTCAAAATCTCrrCAGAAAAAGTGACAACAGTAGCTGAGTTTGGTGATTCTA 
GTAAGTTAACTGTAGGAGAAACTGCTATTGCCATCGGTAGCCCGTTAGGTTCTGAATATGCAAATACTGTCACTCA 
AGGTATCGTATCCAGTCTCAATAGAAATGTATCCTTAAAATCGGAAGATGGACAAGCTATTTCTACAAAAGCCAT 

55 CCAAACTGATACTGCTATTAACCCAGGTAACTCTGGCGGCCCACTGATCAATATTCAAGGGCAGGTTATCGGAAT 
TACCTCAAGTAAAATTGCTACAAATGGAGGAACATCTGTAGAAGGTCITGGTTTCGCAATTCCTGCAAATGATGCT 
ATCAATATTATTGAACAGTTAGAAAAAAACGGAAAAGTGACGCGTCCAGCTTTGGGAATCCAGATGGTTAATTTA 
TCTAATGTGAGTACAAGCGACATCAGAAGACTCAATATTCCAAGTAATGTTACATCTGGTGTAATTGTTCGTTCGG 
TACAAAGTAATATGCCTGCCAATGGT CACC TTGAAAAATACGATGTAATTACAAAAGTAGATGACAAAGAGATTG 

60 CTTCATCAACAGACTTACAAAGTGCTCrrTACAACCATTCTATCGGAGACACCATTAAGATAACCTACTATCGTAA 
CGGGAAAGAAGAAACTACCTCTATCAAACrTAACAAGAGTTCAGGTGATTTAGAATCTTAA 

MEANMKHLKTFYKKWFQLLVVIVISFFSGALGSFSITQLTQKSSVNNSNNNSTITQTAYKNENSTTQAVNKVKDAVVSV 
ITYSANRQNSVFGNDDTDTDSQR1SSEGSGVIYKKNDKEAYIVTNNHVINGASKVDIRLSDGTKVPGEIVGADTFSD1AV 
65 VKISSEKVTTVAEFGDSSKLTVGETAlAlGSPLGSEYAm*VTQGIVSSLNRNVSLKSEDGQAISTKAIQTDTAINPGNSGGP 
LINIQGQVIGrrSSKIATNGGTSVEGLGFAIPANDAINUIEOLEKNGKVTRPALGIQMVNLSNVSTSDIRRLNIPSNVTSGVlV
RSVQSNMPANGHLEKYDVITKVDDKEIASSTDLQSALYNHSIGDT1KITYYRNGKEETTSIKLNKSSGDLESZ 

ID15 939bp


ATGGCAGAAATTTATCTAGCAGGTGGTTGTTTTTGGGGCCTAGAGGAATAl"ri"I"I CACGCATTTCTGGAGTGCTAG 
AAACCAGTGTTGGCTACGCTAATGGTCAAGTCGAAACGACCAATTACCAGTTGCTCAAGGAAACAGACCATGCAG 
AAACGGTCCAAGTGATTTACGATGAGAAGGAAGTGTCACTCAGAGAGATTTTACTTTATTATTTCCGAGTTATCGA 
TCCTCTATCTATCAATCAACAAGGGAATGACCGTGGTCGCCAATATCGAACTGGGATTTATTATCAGGATGAAGC 

10 AGATTTGCCAGCTATCTACACAGTGGTGCAGGAGCAGGAACGCATGCTGGGTCGAAAGATTGCAGTAGAAGTGGA 
G C A ATT ACG CC A CT A CATTCTG G CTG AAG A CT A CCACC AAG ACT ATCTC AG G AAG AATCCTTC AG GTT ACTGTC AT 
ATCGATGTGACCGATGCTGATAAGCCATTGATTGATGCAGCAAACTATGAAAAGCCTAGTCAAGAGGTGTTGAAG 
GCCA GTCT ATCTGAAGAGTCTTATCGTGTCACACAAGAAGCTGCTACAGAGGCTCCATTTACCAATGCCTATGACC 
AAACCTTTGAAGAGGGGATTTATGTAGATATTACGACAGGTGAGCCACTCrri'IM'IGCCAAGGATAAGTrTGCTTC 

15 AGGTTGTGGTTGGCCAAGTTTTAGCCGTCCGATTTCCAAAGAGTTGATTCATTATTACAAGGATCTGAGCCATGGA 
ATGGAGCGAATTGAAGTTCGTTCTCGTTCAGGCAGTGCTCACrTGGGTCATGTTTTCACAGATGGACCGCGGGAGT 
TAGGCGGCCTCCGTTACTGTATCAATTCTGC i I CI 11 A CG CTTTGTG G CCA AG G ATG AG ATGG A A AA AGC AG GAT A 
TGGCTATCTATTGCCTTACTTAAACAAATAA 

20 MAEIYLAGGCFWGLEEYFSRISGVLETSVGYANGQVETTNYQLLKETDHAETVOVIYDEKEVSLREILLYYFRVIDPLSI 
NQQGNDRGRQYRTGrV^QDEADLPAIYTVVQEQERMLGRKIAVEVEQLRHYlLAEDYHQDYLRKNPSGYCHIDVTDA 
DKPLlDAANYEKPSQEVLKASLSEESYRVTQEAATEAPFTNAYDQTFEEGIYVDnTGEPLFFAKDKFASGCGWPSFSRPI 
SKELIHYYKDl^HGMERIEVRSRSGSAHLGHVFTDGPRELGGLRYCINSASLRPVAKDEMEKAGYGYLLPYLMCZ 

ID17 870bp



ATGAAGATTATTGTACCTGCAACCAGTGCCAATATCGGGCCAGGTTTTGACTCGGTCGGTGTAGCTGTAACCAAGT 
ATCTTCAAATTGAGGTCTGCGAAGAAC GAGA TGAGTGGCTGATTGAACACCAGATTGGCAAATGGATTCCACATG 
A CG AG CGT A ATCT CTTG CTC A A A ATCG CTTTG C A AATTGT A CC AG A CTTG C AA CC AAG A CG CTTG AAA ATG ACC A 
30 GTGATGTCCCTTTGGCGCGCGGTTTGGGTTCTTCCAGCTCGGTTATCGTTGCTGGGATTGAACTAGCCAACCAACT 
G GGTC A ACTC A A CTT ATC AG A CC ATG A A AAATTG C AGTT AG CG A CC A AG ATTG AAG GG C ATCCTG A C A ATGTGG C 
TCCAGCCA TTTA TGGTAATCTCGTTATTGCAAGTTCTGTTGAAGGGCAAGTCTCTGCTATCGTAGCAGACTTTCCA 
GAGTGTGATTTTCTAGCTTACATTCCAAACTATGAATTACGTACTCGCGACAGCCGT^ 

TGTCTTATAAGGAAGCrGTTGCTGCAAGTTCTATCGCCAATGTAGCGGTTGCTGCCTTGTTGGCAGGAGACATGGT 
35 GACCGCTGGGCAAGCAATCGAGGGAGACCTCTTCCATGAGCGCTATCGTCAGGACTTGGTAAGAGAATTTGCGAT 
GATTAAGCAAGTGACCAAAGAAAATGGGGCCTATGCAACCTACCTTTCTGGTGCTGGGCCGACAGTTATGGTTCT 
GGCTTCTCATGACAAGATGCCAACAATTAAGGCAGAATTGGAAAAGCAACCTTTCAAAGGAAAACTGCATGACTT 
GAGAGTTGATACCCAAGGTGTCCGTGTAGAAGCAAAATAA 

40 MKIIVPATSANIGPGFDSVGVAVTKYLQIEVCEERDEWLIEHQIGKWIPHDERNLLLKIALQIVPDLQPRRLKMTSDVPLA 
RGLGSSSSVIVAGIELANQLGQLNLSDHEKLQLATKIEGHPDNVAPAIYGNLV1ASSVEGQVSAIVADFPECDFLAYIPNY 
ELRTRDSRSVLPKKLSYKEAVAASS1ANVAVAALLAGDMVTAGQAIEGDLFHERYRQDLVREFAMIKQVTKENGAYAT 
YLSGAGPTVMVLASHDKMPTIKAELEKQPFKGKLHDLRVDTQGVRVEAKZ 

ID20 564bp

ATGAAATATCACGATTACATCTGGGATTTAGGTGGAACTTTACTGGATAATTATGAAACTTCA^ 
TTGAAACATTGGCACTGTATGGTATCACACAAGACCATGACAGTGTCTATCAAGCTTTAA 

TGCGATTG AGAC ATTCGCTCCCAATTTAGAGAATTTTTTAGAAAAGTACAAGGAAAATGAAGCCAGAGAGCTTGA 
DU ACACCCGATTTTATTTGAA GGAG TTTCTGACCTATTGGAAGACATTTCAAATCAAGGTGGCCGTCATTTTTTGGTC 
TCTCATCGAA ATGA TCAGGTTTTGGAAATTTTAGAAAAAACCrCTATAGCAGCTTATTTTACAGAAGTGGTGACTT 
CTAGCTCAGGCTTTAAGAGAAAGCCAAATCCCGAATCCATGCTTTATTTAAGAGAAAAGTATCAGATTAGCTCTG 
GTCTTGTCATTGGTGATCGGCCGATTGATATCGAAGCAGGTCAAGCTGCAGGACTTGATACCCACTTGTTTACCAG 
T ATCGTG A ATTT A AG AC AAGT ATT AG AC AT AT A A 

MKYHDYIWDLGGTLLDNYETSTAAFVETLALYGITODHDSVYQALKVSTPFAIETFAPNLENFLEKYKENEARELEHPI 
LFEGVSDLLEDISNQGGRHFLVSHRNDQVLEILEKTSIAAYFTEVVTSSSGFKRKPNPESMLYLREKYQISSGLVIGDRPID 
IEAGQAAGLDTHLFTSIVNLRQVLDIZ 

ID21 1875bp

ATGACAGAAGAAATCAAAAATCTGCAGGCACAGGATTATGATGCCAGTCAAATTCAAGTTTTAGAGGGCTTAGAG 
GCTGTTCGTATGCGTCCAGGGATGTACATTGG ATC AACCTCAA AAG AAGGTCTTCACCATCTAGTCTGGG AAATTG 
TTGATAACTCAATTGACGAGGCCTTGGCAGGATTTGCCAGCCATATTCAAGTTTTTATTGAGCCAGATGATTCGAT 
CO TACTGTTGTGGATGATGGGCGTGGTATCCCAGTCGATATTCAGGAAAAAACAGGCCGTCCTGCTGTTGAGACCGT 
CTTTACAGTCCTTCACGCTGGAGGAAAGTTCGGCGGTGGTGGATACAAGGTTTCAGGTGGTCTTCACGGGGTGGG
GTCGTCAGTAGTTAATGCCCTTTCCACTCAATTAGACGTTCATGTTCACAAAAATGGTAAGATTCATTACCAAGAA 
TACCGTCGTGGTCATGTTGTCGCAGATCTTGAAATAGTTGGAGATACGGATAAAACAGGAACAACTGTTCACTTC 
AC A CCG G ACCC AAA A ATCTTC ACTG AAAC AAC AATCTTTG ATTTTG AT AAATT AAAT AAACG G ATTC AAG AGTTG 
5 GCCTTTCTAAATCGCGGTCTTCAAATTTCAATTACAGATAA 

ATGAAGGTGGGATTGCTAGTTACGTTGAATATATCAACGAGAACAAGGATGTAATCTTTGATACACCAATCTATA 
CAGACGGTGAGATGGATGATATCACAGTTGAGGTAGCCATGCAGTACACAACTGGTTACCATGAAAATGTCATGA 
GTTTCGCCAATAATATTCATACCCATGAAGGTGGAACACATGAACAAGGTTTCCGTACAGCCTTGACACGTGTTAT 
CAACGATTATGCTCGTAAAAATAAGTTACTGAAAGACAATGAAGATAATTTAACAGGGGAAGATGTTCGCGAAGG 

10 CTTAACTGCAGTTATCTCAGTTAAj\CACCCAAATCCAC^ 

CGAAGTGGTCAAGATTACCAATCGCCTCTTCAGTGAAGCTTTCTCCGATTTCCTCATGGAAAATCCACAGATTGCC 
AAACGTATCGTAGAAAAAGGAATTTTGGCTGCCAAGGCTCGTGTGGCTGCCAAGCGTGCGCGTGAAGTCACACGT 
AAAAAATCTGGTTTGGAAATTTCCAACCTTCCAGGGAAACTAGCAGACTGTTCTTCTAATAACCCTGCTG 
AACTCTTCATCGTCGAAGGAGACTCAGCTGGTGGATCAGCCAAATCTGGTCGTAACCGTGAGTTTCAGGCTATCCT 

15 TCCAATTCGCGGTAAGATTTTGAACGTTGAAAAAGCAAGTATGGATAAGATTCTAGCCAACGAAGAAATTCGTAG 
T C ' l ' l ' I ' l CACAGCCATGGGAACAGGATTTGGCGCAGAATTTGATGTTTCGAAAGCCCGTTACCAAAAACTC U ' rrrr G 
ATGACCGATGCCGATGTCGATGGAGCCCACATTCGTACCCriClU'JnAACCTTGATTTATCGTTATATGAAACCAA 
TCCT AG AAG CTGGTT ATGTTT AT ATTG CCC A ACC ACC A ATCT ATGGTGTC AAGGTTGG AAG CG AG ATT A AAG A AT A 
TATCCAGCCGGGTGCAGATCAAGAAATCAAACTCCAAGAAGCTTTAGCCCGTTATAGTGAAGGTCGTACCAAACC 

20 GACTATTCAGCGTTATAAGGGGCTAGGTGAAATGGACGATCATCAGCTGTGGGAAACAACCATGGATCCCGAACA 
TCGCTTGATGGCTAGAGTTTCTGTAGATGATGTGCAGAAGCAGATAAAATCTTTGATATGTTGA 

NfTEEIKNLQAQDYDASQIQVLEGLEAVRMIU>GMYIGSTSKEGLHHLVWEIVDNSIDEALAGFASHIQVFIEPDDSrrVVD 
DGRGIPVDIQEKTGRPAVETVFTVLHAGGKFGGGGYKVSGGLHGVGSSVVNALSTQLDVHVHKNGKIHYQEYRRGHV 

25 VADLEIVGDTDKTGTTVHFTPDPKIFTETTIFDFDKLNKRIQELAFLNRGLQISITDKRQGLEQTKHYHYEGGIASYVEYI 
NENKDVIFDTPIYTDGEMDDrrVEVAMQYTTGYHEWMSFANNIHTHEGGTHEQGFRTALTRVINDYARKNKLLKDN 
EDNLTGEDVREGLTAVISVKHPNPQFEGQTKTTCLGNSEVVKrTNRLFSEAFSDFLMENPQIAKRIVEKGILAAKARVAAK 
RAREVTRKKSGLEISNLPGKLADCSSNNPAETELFIVEGDSAGGSAKSGRNREFQAILPIRGKILNVEKASMDKILANEEI 
RSLFTAMGTGFGAEFDVSKARYOKLVLMTDADVDGAHIRTLLLTLIYRYMKPILEAGYVYIAQPPIYGVKVGSEIKEYI 

QPGADQEIKLQEALARYSEGRTKPTIQRYKGLGEMDDHQLWETTMDPEHRLMARVSVDDVQKQIKSLICZ

ID54 1446bp

ATGAGTAGACGTTrrAAAAAATCACGTTCACAGAAAGTGAAGCGAAGTGTTAATATAGTTTTGCTGACTATTTA^ 
35 TATTGTTAGTTrGTTTTTTATTGTTCTTAATCTrrAAGTACAATATCCTTGC 

CTGCGTTAGTCCTACTAGTTGCCTTGGTAGGGCTACTCTTGATTATCTATAAAAAAGCTGAAAAGTTTACT 
CTGTTGGTGTTCTCTATCCrrGTCAGCTCTGTGTCGCTCTTTGCAGTACAGCA 

AAATGCGACTTCTAATTACTCAGAATATTCAATCAGTGTCGCTGTTTTAGCAGATAGTGAGATCGAAAATGTTACG 
CAACTGACGAGTGTGACAGCACCGACTGGGACTAATAATGAAAATATTCAGAAATTACTAGCTGATATCAAGTCA 

40 AGTCAGAATACCGATTTGACGGTCAACCAGAGTTCGTCTTACTTGGCAGCTTACAAGAGTTTGATTGCAGGGGAG 
ACTAAGGCCATTGTCCTAAATAGTGTCTTTGAAAACATCATCGAGTCAGAGTATCCAGACTACGCATCGAAGATA 
AAAAAGATTTATACTAAGGGATTCACTAAAAAAGTAGAAGCTCCTAAGACGTCTAAGAGTCAGTCTTTCAATATC 
TATGTTAGTGGAATTGACACCTATGGTCCTATTAGTTCGGTGTCGCGATCAGATGTCAACATCCTGATGACTGTCA 
ATCGAGATACCAAGAAAATCCTCTTGACCACAACGCCACGTGATGCCTATGTACCAATCGCAGATGGTGGAAATA 

45 ATCAAAAAGATAAATTGACTCATGCGGGCATTTATGGAGTTGATTCGTCCATTCACACCTTAGAAAATCTCTATGG 
AGTGGATATCAATTACTATGTGCGATTGAACITCACTTCGTTTTTGAAATTGATTGATTTGTTGGGTGG 
TTTATAATGATCAAGAATTTACTGCCCATACGAATGGAAAGTATTACCCTGCAGGCAATGTTCATCTTGATTCAGA 
ACAGGCTCTCGGTTTTGTTCGTGAGCGCTACTCCCTAGCAGATGGCGATCGTGACCGCGGGCGCCATCAACAAAA 
G GTG ATTGTGG CT ATCCTTC AAA AATT A A CGTC A A CCG A AGTG CTG A AA AATT AT AGT A CG ATC ATT A AT AG CTTG 

50 CAAGATTCTATCCAAACAAATATGCCACTTGAGACCATGATAAATTTGGTCAATGCTCAGTTAGAAAGTGGAGGG 
A ATT AT AA AGT A A ATTCTC A AG ATTT A A A AGG G A C AGGTCG G ATG G ATCTTCCTTCTT ATG C A ATG CC AG AC AGT A 

ACCTCTATGTGATGGAAATAGATGATAGTAGTTTAGCTGTAGTTAAAGCAGCTATACAGGATGTGATGGAGGGTA 
GATGA 

55 MSRRFKKSRSQKVKRSVNIVLLTIYLLLVCFLLFLIFKYN1LAFRYLNLVVTALVLLVALVGLLLIIYKKAEKFTIFLLVFS 
ILVSSVSLFAVQQFVGLTNRLNATSNYSEYSISVAVLADSEIENVTQLTSVTAPTGTNNENIQKLLADIKSSQNTDLTVNQ 
SSSYLAAYKSLIAGETKAIVLNSVFENIIESEYPDYASKIKKIYTKGFTKKVEAPKTSKSQSFNIYVSGIDTYGPISSVSRSDV 
NILMTVNRDTKKILLTTTPRDAYVPIADGGNNQKDKLTHAGIYGVDSSIHTLENLYGVDINYYVRLNFTSFLKLIDLLGG 
IDVYNDQEFTAHTNGKYYPAGNVHLDSEQALGFVRERYSLADGDRDRGRHQQKVIVAILQKLTSTEVLKNYSTIINSLO 

00 DSIQTNMPLETMINLVNAQLESGGNYKVNSQDLKGTGRMDLPSYAMPDSNLYVMEIDDSSLAVVKAAIQDVMEGRZ 
ATGATAGACATCCATTCGCATATCGTTTTTGATGTAGATGACGGTCCCAAGTCAAGAGAGGAAAGCAAGGCTCTC
TTGGCAGAATCCTACAGACAGGGGGTGCGAACCATTGTTTCTACCTCTCACCGTCGCAAGGGCATGTTTGAAACTC 
CGGAAGAGAAGATAGCAGAAAACrilCriCAGGTTCGGGAAATAGCTAAGGAAGTGGCGAGTGACTTGGTCATTG 
CTTACGGGGCTGAAATTTATTACACACCAGATGTTCTGGATAAGCTGGAAAAAAAGCGGATTCCGACCCTCAATG 
5 ATAGTCGTTATGCCnTGATAGAGTTTAGTATGAACACTCCTTATCGCGATATTCATAGCGCCTTGAGCAAGATCTT 
GATGTTGGGAATTACTCCAGTCATTGCCCACATTGAGCGCTATGATGCTCITGAAAATAATGAAAAuACGCGTTCGA 
GAACTGATCGATATGGGCTGTTACACGCAAGT^ATAGTTCACATG 

ATAAATTCATGAAAAAAAGAGCTCAGTATTTTTTAGAGCAGGATTTGGTTCATGTCATTGCAAGTGATATGCACAA 
TCTA GACGGT AGACCTCCTCATATGGCAGAAGCATATGACCTTGTTACCCAAAAATACGGAGAAGCGAAGGCTCA 
1 0 GG AACTTTTTATAG ACAATCCTCG AAAA ATTGTAATGG ATC AACTAATTTAG 

MIDIHSHIVFDVDDGPKSREESKALLAESYRQGVRTIVSTSHRIUCGMFETPEEKIAENFLQVRE1AKEVASDLVIAYGA 
YYTPDVLDKLEKKRIPTLNDSRYALIEFSMNrrPYRDIHSAl^KILMLGrTPVIAHIERYDALENNEKRVRELIDMGCYTQV 
NSSHVLKPKLFGERYKFMKKRAQYFLEQDLVHV1ASDMHNLDGRPPHMAEAYDLVTQKYGEAKAQELFIDNPRKIVM 
DQLIZ

ID58 3990bp

TTGA TTTATATAATCGCTATCAATATAACAATGCAATCAGGAGGTTTTGCAATGAAACATGAAAAACAACAGCGT 
20 TTTTCTATTCGTAAATACGCTCTAGGAGCAGCTTC 

CCGATGGAGTTACTCCTACrACTACAGAAAACCAACCGACCATCCATACGGTTTCTGATTCCCCTCAATCATCCGA 
AAATCGGACTGAGGAAACACCTAAAGCAGTGCTTCAACCAGAAGCTCCAAAAACTGTAGAAACAGAAACTCCAG 
CTACTGATAAGGTAGCTAGTCTTCCAAAAACAGAAGAAAAACCACAAGAGGAAGTTAGTTCAACTCCTAGTGATA 
. AAGCAGAAGTGGTAACTCCAACTTCTGCTGAAAAAGAAACTGCTAATAAAAAGGCAGAAGAAGCT 
25 AAGGAAGAAGCGAAAGAGGTTGATTCTAAAGAGTCAAATACAGACAAGACTGACAAGGATAAACCAGCTAAAAA 
AGATGAAGCGAAAGCAGAGGCTGACAAACCGGCAACAGAGGCAGGAAAGGAACGTGCTGCAACTGTAAATGAAA 
AACTAGCGAAAAAGAAAATTGTTTCTATTGATGCTGGACGTAAATATTTCTCACCAGAACAGCT 
TCGATAAAGCGAAACATTATGGCTACACTGATTTACACCTATTAGTCGGAAATGATGGACTCCGTTTCATGTTGGA 
CGATATGAGCATCACAGCTAACGGCAAGACCTATGCCAGTGACGATGTCAAACGCGCCATTGAAAAAGGTACAAA 
30 TGATTATTACAACGATCCAAACGGCAATCACITAACAGAAAGTCAAATGACAGATCTGATTAACTATGCCAAAGA 
TAAAGGTATCGGTCTCAT TCCGA CAGT AAAT AGTCCTGGACACATGGATGCGATTCTCAATGCCATGAAAGAATT 
GGGA ATCCAA AACCCTAACnTTAGCTATTTTGGGAAGAAATCAGCCCGTACTGTCGATCTTGACAACGAACAAGC 
TGTCGCIUUM'ACAAAAGCCCTTATCGACAAGTATGCTGCTTATTTCGCGAAAAAGACTGAAATCTTCAACATCGGA 
CTTGATGAATATGCCAATGATGCGACAGATGCTAAAGGTTGGAGTGTGCTTCAAGCTGATAAATACTATCCAAAC 
3D GAAGGCTACCCTGTAAAAGGCTATGAAAAATTTATTGCCTACGCCAATGACCTCGCTCGTA 

GGTCTCAAACCAATGGCTrTTAACGACGGTATCTACTACAATAGCGACACAAGCTTTGGTAGTTTTGACAAAGAC 
ATCATCGTTTCTATGTGGACTGGTGGTTGGGGAGGCTACGATGTCGCTTCTTCTAAACTACTAGCT 
ACCAAATCCTTAATACCAATGATGCITGGTACTACGTTCTTGGACGAAACGCrGATGGCCAAGGCTGGTACAATCT 
CGATCAGGGGCTCAATGGTATTAAAAACACACCAATCACTTCTGTACCAAAAACAGAAGGAGCTGATATCCCAAT 
40 CATCGGTGGTATGGTAGCTGCTTGGGCTGACACTCCATCTGCACGTTATTCACCATCACGCCT 

CGTCATTTTGCAAATGCCAACGCTGAATACTTCGCAGCTGATTATGAATCTGCAGAGCAAGCACTTAACGAGGTA 
CC AA A AG A CCTG A ACCGTT AT ACTG C AG A A AG CGTC ACGG CCGT AA A AG A AG CTG A AAA AG CT ATTCG CTCTCTC 
GATAGCAACCTTAGCCGTGCCCAACAAGATACGATTGATCAAGCCATTGCTAAACrrCAAGAAACTGTCAACAAC 
TTGACCCTCACGCCTGAAGCTCAAAAAGAAGAAGAAGCTAAACGTGAGGTTGAAAAACTTGCCAAAAACAAGGT 
45 AATCTCAATCGATGCTGGACGCAAATACTTTACTCTGAACCAGCTCAAACGCATCGTAGACAAGGCCAGTGAGCT 
CGGATATTCTGATGTCCATCTCCTTCTAGGAAATGACGGACTTCGCrrrCTACTCGATGATATGACCATTACTGC 
AACGGAAAAACCTATGCTAGTGATGACGTTAAAAAAGCTATTATCGAAGGAACTAAAGCTTACTACGACGATCCA 
AACGGTACTGCACTAACACAGGCAGAAGTAACAGAGCTAATTGAATACGCTAAATCTAAGGACATCGGTCTCATC 
CCAGCTATTAACAGTCCAGGTCACATGGATGCTATGCTGGTTGCCATGGAAAAATTAGGTATTAAAAATCCTCAA 
OU GCCCACTTTGATAAAGTTTC AAAA ACAACTATGGACTTGAAAAACGAAGAAGCGATGAACTTTGTAAAAGCCCTC 
ATCGGTAAATACATGGACTTCTTTGCAGGTAAAACAAAGATTTTCAACTTTGGTACTGACGAATACGCCAACGAT 
GCGACTAGTGCCCAAGGCTGGTACTACCTCAAGTGGTATCAACTCTATGGCAAATTTGCCGAATATGCCAACACC 
CTCGCAGCTATGGCCAAAGAAAGAGGGCTTCAACCAATGGCCTTCAACGATGGCTTCTACTATGAAGACAAGGAC 
GATGTTCAGTTTGACAAAGATGTCTTGATTTCTTACTGGTCTAAAGGCTGGTGGGGATATAACCTCGCATCACCT 
J J AATACCTAGCAAGCAAAGGCTATAAATTCTTGAATACCAACGGTGACTGGTACTACATTCTTGGTCAAAAACCAG 
AAGATGGTGGTGGTTTCCTCAAGAAAGCTATTGAGAATACTGGAAAAACACCATTCAATCAACTAGCTTCTACCA 
AAT ATCCTG A AGT AG ATC TTCC A AC AGTCO G A AGT ATG CTTTC AATCTGG G C AG AT AG ACC AAG CG CTG AAT AC A 
AG G A AG AG G A A ATCTTTG AACTC ATG ACTG CCTTTG C AG A CC A C A AC A AAG A CT A CTTTCGTG CT AATT AT AATG 
CTCTCCG CG A AG A ATT AGCT AAA ATTCCT AC AA ACTT AG A AG G AT AT AG T AA AG AA AGTCTTG AGG CCCTTG ACG 
CAGCTAAAACAGCTCTAuAATTACAACCTCAACCGTAATAAACAAGCTGAGCTTGACACGCTTGTAGCCAACCTAA 
AAGCCGCTCTTCAAGGCCrCAAACCAGCrGTAACrCATTCAGGAAGCCTAGATGAAAATGAAGTGGCrGCCAATG 
TTGAAACCAGACCAGAACTCATCACAAGAACTGAAGAAATTCCATTTGAAGTTATCAAGAAAGAAAATCCTAACC 
TCCCAGCCGGTCAGGAAAATATTATCACAGCAGGAGTCAAAGGTGAACGAACTCATTACATCTCTGTACTCACTG 
AAAATGGAAAAACAACAGAAACAGTCCTTGATAGCCAGGTAACCAAAGAAGTTATAAACCAAGTGGTTGAAGTT 
GGCGCTCCTGTAACrCACAAGGGTGATGAAAGTGGTCTTGCACCAACTACTGAGGTAAAACCTAGACTGGATATC 
CAAGAAGAAGAAATTCCATTTACCACAGTGACTTC
ACTAAGGGCGTCAATGGACATCGTAGCAACTTCTACTCTGTGA 

CTTGTAAATAGTGTCGTAGCACAGGAAGCCGTTACTCAAATAGTCGAAGTCGGAACTATGGTAACACATGTAGGC 
GATGAAAACGGACAAGCCGCTATTGCTGAAGAAAAACCAAAACTAGAAATCCCAAGCCAACCAGCTCCATCAAC 
5 TGCTCCTGCTGAGGAAAGCAAAGTTCTTCCTCAAGATCCAGCTCCTGTGGTAACAGAGAAAAAACTTCCT 

AGGAACTCACGATTCTGCAGGACTAGTAGTCGCAGGACrCATGTCCACACTAGCAGCCTATGGACrCACTAAAAG 
A AAAG A AG ACT AA 

MIYIIAINrrMQSGGFAMKHEKQQRFSIRKYAVGAASVLIGFAFQAQTVAAIXj\aTTTTENQP^ 
10 TPKAVLQPEAPKTVETETPATDKVASLPKTEEKPQEEVSSTPSDKAEVVTPTSAEK 
ShTTDKTDKDKPAKKDEAKAEADKPATEAGKERAATVNEiaAKK^ 
VG^QX}LRFMLDDMSITANGKTYASDDVKRAIEKGTNDYY^ro^ 

AILNAMKELGIONPNFSYFGKKSARTVDLDNEQAVAFnCALIDKYAAYFAKKTEIFNIGLDEYANDATDAKGWSVLQA 
DKYYPNEGYPVKGYEKFlAYANDLARIVKSHGLKPMAFNDGrYYNSDTSFGSFDKDirVSMWTGGWGGYDVASSKLLJ\ 

15 EKGHQILOTNDAWYYVLGRNAIX3QG\*rYNLIXKjLNGK 

MRHFANANAEYFAADYESAEQALNEVPKDLNRYTAESVTAVKEAEKAIRSLDSN1-SRAQQDTIDQAIAKLQETVNNLT 
LTPEAQKEEEAKI^VEKLAKhnCViSIDAGRKYFTLNQLKRIVDKASELGYSDVHLIXGNDGLRFLLDDMTITANGKTYA 
SDDVKKAIIEGTKAYYDDPNGTALTQAEVTELIEYAKSKDIGLIPAINSPGHMDAMLVAMEKLGIKNPQAHFDKVSKTT 
MDLKNEEAMNFVKALIGKYMDFFAGKTKIFNFGTDEYANDATSAQGWYYUCWYQLYGKFAEYANTLAAMAKERGL 

20 QPMAFNDGFYYEDKDDVQFDKDVLISYWSKGWWGYNLASPQYLASKGYKFLNTNGDWYYILGQKPEDGGGFLKKAI 
ENTGKTPFNQI^TKYPEVDLPTVGSML^IWADRPSAEYKEEEIFELMTAFADHNKBYFRANYNAL^ 
YSKESLEALDAAKTALNYNLNRNKQAELDTLVANLKAALQGLKPAVTHSGSLDENEVAANVETRPELITRTEEIPFEVI 
KKENPNLPAGQENIITAGVKGERTHYISVLTENGKTTETVLDSOVTKEVINQVVEVGAPVTHKGDESGLAPTTEVKPRL 
DIOEEOPFTTVTCENPLLLKGKTQVITKGVNGHRSNFYSVSTSADGKEVKTLVNSVVAQEA\nrQIVEVGTMVTHVGDE 

25 NGQAAIAEEKPKLEIPSQPAPSTAPAEESKVLPQDPAPVVTEKKLPETGTHDSAGLWAGLMSTLAAYGLTKRKEDZ 

ID122 825bp 

ATGAACAAAAAAACAAGACAGACACTAATCGGACTGCTAGTGTTATTGCTTTTGTCTACAGGGAGCTATTAT^ 
30 AAGCAGATGCCGTCGGCACCTAATAGTCCCAAAACCAATCTTAGTCAGAAAAAACAAGCGTCTGAAGCTCCTAGT 
CAAGC ATTGG CAGAGAGTGTCTTAACAGACGCAGTCAAGAGTCAAATAAAGGGGAGTCTGGAGTGGAATGGCTC 
AGGTGCTTTTATCGTCAATGGTAATAAAACAAATCTAGATGCCAAGGTTTCAAGTAAGCCCTACGCTGACAATAA 
AACAAAGACAGTGGGCAAGGAAACTGTTCCAACCGTAGCTAATGCCCTCTTGTCTAAGGCCACTCGTCAGTACAA 
GAATCGTAAAGAAACTGGGAATGGTTCAACTTCTTGGACTCCTCCAGGTTGGCATCAGGTCAAGAATCTAAAGGG 
35 CTCTTATACCCATGCAGTCGATAGAGGTCATTTGTTAGGCTATGCCTTAATCGGTGGTTTGGATGGTTT^ 

CAACAAGCAATCCTAAAAACATTGGTGTTCAGACAGCCTGGGCAAATCAGGCACAAGCCGAGTATTCGACTGGTC 
AAA A CT ACT ATG AAAG C AAG GTG CGT A AAG CCTTG G A CC A AAA C AAG CGTGTCCGTT ACCGTGT AA CC CTTT A CT 
ACGCITCAAACGAGGATTTAGTTCCCTCAGCTTCACAGATTGAAGCCAAGTCTrCGGATGGAGAATrGGAATTCA 
ATGTTCTAGTTCCCAATGTTCAAAAGGGACrrCAACTGGATTACCGAACTGGAGAAGTAACTGTAACTCAGTAA 


MNKKTRQTLIGLLVLLU-STGSYYIKQMPSAPNSPKTNLSQKKQASEAPSQALAESVLTDAVKSQIKGSLEWNGSGAFIV 
NGNKTNLDAKVSSKPYAD^TKTVGKETVPTVANALLSKATRQYKNRKETGNGSTSWTPPGWHQVKNLKGSYTHAV 
DRGHLLGYALIGGLDGFDASTSNPKNIAVQTAWANQAQAEYSTGQNYYESKVRKALDQNKRVRYRVTLYYASNEDL 
VPSASQIEAKSSDGELEFNVLVPNVQKGLQLDYRTGEVTVTQZ 


ID123 225bp 

GTGCTAAGATTCAGCGGA TTGAG GCAAGTGATGAAGATGAATAAGAAATCAAGCTACGTAGTCAAGCGTTTACTT 
TTAGTCATCATAGTACTGATTTTAGGTACTCTGGCTCrAGGAATCGGTTTAATGGTAGGTTATGGAATCTTGGGCA 
50 AGGGTCAAGATCCATGGGCTATCCTGTCTCCAGCAAAATGGCAGGAATTGATTCATAAATTTACAGGAAATTAG 

VLRFSGLRQVMKMNKKSSYVVKRLLLVIIVLILGTLALGIGLMVGYGILGKGQDPWAILSPAKWQELIHKFTGNZ 
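
Each entry in the tables above pairs a DNA sequence with its deduced translation, and the translations appear to end in "Z" where the stop codon falls. The following Python sketch shows one way a reader might check an entry for internal consistency. It assumes the standard genetic code and a literal codon-by-codon reading (so a GTG or TTG start is written as V or L, as in several entries here); the function names `translate` and `expected_length_bp` are illustrative and are not part of the specification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, stop_symbol: str = "Z") -> str:
    """Translate DNA codon by codon, writing the first stop codon as stop_symbol."""
    seq = "".join(dna.split()).upper()  # drop layout whitespace from the listing
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an unreadable codon
        if aa == "*":
            protein.append(stop_symbol)
            break
        protein.append(aa)
    return "".join(protein)


def expected_length_bp(protein: str) -> int:
    """Base pairs implied by a translation that includes its stop symbol."""
    return 3 * len(protein)


print(translate("GTGCTAAGATTCAGCGGA"))  # first six codons of the ID123 entry
print(expected_length_bp("VLRFSGZ"))
```

Applied to the ID123 entry, the 75-symbol translation (74 residues plus the terminal Z) implies 225 bp, which agrees with the stated length. Characters that are not A, C, G or T are reported as X rather than guessed.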
CLAIMS:

1. A Streptococcus pneumoniae protein or polypeptide having a sequence
selected from those shown in Table 1.

2. A Streptococcus pneumoniae protein or polypeptide having a sequence
selected from those shown in Table 2.

3. A protein or polypeptide as claimed in claim 1 or claim 2 provided in 
substantially pure form. 

4. A protein or polypeptide which is substantially identical to one defined in any 
one of claims 1 to 3. 

5. A homologue or derivative of a protein or polypeptide as defined in any one of 
claims 1 to 4. 

6. An antigenic and/or immunogenic fragment of a protein or polypeptide as
defined in Tables 1-3.

7. A nucleic acid molecule comprising or consisting of a sequence which is: 

(i) any of the DNA sequences set out in Table 1 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 
(iii) a sequence which codes for the same protein or polypeptide, as those
sequences of (i) or (ii); 

(iv) a sequence which is substantially identical with any of those of (i), (ii) 
and (iii); 

(v) a sequence which codes for a homologue, derivative or fragment of a 
protein as defined in Table 1.

8. A nucleic acid molecule comprising or consisting of a sequence which is: 

(i) any of the DNA sequences set out in Table 2 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which is substantially identical with any of those of (i), (ii) 
and (iii); 

(v) a sequence which codes for a homologue, derivative or fragment of a 
protein as defined in Table 2. 

9. The use of a protein or polypeptide having a sequence selected from those 
shown in Tables 1-3, or homologues, derivatives and/or fragments thereof, as an 
immunogen and/or antigen. 

10. An immunogenic and/or antigenic composition comprising one or more 
proteins or polypeptides selected from those whose sequences are shown in Tables 1- 
3, or homologues or derivatives thereof, and/or fragments of any of these. 

11. An immunogenic and/or antigenic composition as claimed in claim 10 which 
is a vaccine or is for use in a diagnostic assay. 

12. A vaccine as claimed in claim 11 which comprises one or more additional
components selected from excipients, diluents, adjuvants or the like. 

13. A vaccine composition comprising one or more nucleic acid sequences as 
defined in Tables 1-3. 

14. A method for the detection/diagnosis of S. pneumoniae which comprises the 
step of bringing into contact a sample to be tested with at least one protein or 
polypeptide as defined in Tables 1-3, or homologue, derivative or fragment thereof. 

15. An antibody capable of binding to a protein or polypeptide as defined in 
Tables 1-3, or to a homologue, derivative or fragment thereof.

16. An antibody as defined in claim 15 which is a monoclonal antibody. 

17. A method for the detection/diagnosis of S. pneumoniae which comprises the step
of bringing into contact a sample to be tested and at least one antibody as defined in
claim 15 or claim 16.

18. A method for the detection/diagnosis of S. pneumoniae which comprises the
step of bringing into contact a sample to be tested with at least one nucleic acid 
sequence as defined in claim 7 or claim 8.

19. A method of determining whether a protein or polypeptide as defined in 
Tables 1-3 represents a potential anti-microbial target which comprises inactivating
said protein or polypeptide and determining whether S. pneumoniae is still viable.

20. The use of an agent capable of antagonising, inhibiting or otherwise 
interfering with the function or expression of a protein or polypeptide as defined in 
Tables 1-3 in the manufacture of a medicament for use in the treatment or
prophylaxis of S. pneumoniae infection.